| Speech synthesis apparatus and method, and storage medium |
|
USPTO Application #: 20060085194
Title: Speech synthesis apparatus and method, and storage medium

Abstract: Input text data undergoes language analysis to generate prosody, and a speech database is searched for a synthesis unit on the basis of the prosody. A modification distortion of the found synthesis unit, and concatenation distortions upon connecting that synthesis unit to those in the preceding phoneme are computed, and a distortion determination unit weights the modification and concatenation distortions to determine the total distortion. An Nbest determination unit obtains N best paths that can minimize the distortion using the A* search algorithm, and a registration unit determination unit selects a synthesis unit to be registered in a synthesis unit inventory on the basis of the N best paths in the order of frequencies of occurrence, and registers it in the synthesis unit inventory.

Agent: Fitzpatrick Cella Harper & Scinto - New York, NY, US
Inventors: Yasuo Okutani, Yasuhiro Komori
Class: 704258000 (USPTO)
Related Patent Categories: Data Processing: Speech Signal Processing, Linguistics, Language Translation, And Audio Compression/decompression, Speech Signal Processing, Synthesis

The Patent Description & Claims data below is from USPTO Patent Application 20060085194.

FIELD OF THE INVENTION

[0001] The present invention relates to a speech synthesis apparatus and method for forming a synthesis unit inventory used in speech synthesis, and a storage medium.
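As a rough illustration of the selection procedure summarized in the abstract, the following Python sketch combines modification and concatenation distortions into a weighted total, ranks candidate unit sequences, and picks the most frequently used units for the inventory. It is not the patented implementation. The function names, the equal default weights, the dictionary-based cost tables, and the single global inventory size are all assumptions made for illustration, and the N-best step uses a plain exhaustive enumeration rather than the A* search the abstract names.

```python
from collections import Counter
from itertools import product
import heapq


def total_distortion(mod, concat, w_mod=0.5, w_concat=0.5):
    """Weighted sum of a modification and a concatenation distortion.

    The weights are hypothetical; the source only says both are weighted.
    """
    return w_mod * mod + w_concat * concat


def path_cost(path, mod_cost, concat_cost, w_mod=0.5, w_concat=0.5):
    """Accumulate the total distortion along one sequence of units."""
    cost = 0.0
    prev = None
    for unit in path:
        # The first unit has no predecessor, so no concatenation distortion.
        concat = 0.0 if prev is None else concat_cost.get((prev, unit), 0.0)
        cost += total_distortion(mod_cost[unit], concat, w_mod, w_concat)
        prev = unit
    return cost


def n_best_paths(candidates, mod_cost, concat_cost, n, w_mod=0.5, w_concat=0.5):
    """Return the n lowest-cost paths, best first.

    Exhaustive stand-in for the A* search mentioned in the abstract;
    only practical for tiny candidate sets.
    """
    return heapq.nsmallest(
        n,
        product(*candidates),
        key=lambda p: path_cost(p, mod_cost, concat_cost, w_mod, w_concat),
    )


def select_units(paths_per_sentence, inventory_size):
    """Pick the units that occur most often across all N-best paths."""
    counts = Counter(
        unit
        for paths in paths_per_sentence
        for path in paths
        for unit in path
    )
    return [unit for unit, _ in counts.most_common(inventory_size)]
```

In this sketch, `candidates` is a list with one list of candidate unit identifiers per phoneme position, `mod_cost` maps a unit to its modification distortion, and `concat_cost` maps a pair of adjacent units to their concatenation distortion. Running `n_best_paths` once per training sentence and passing the collected results to `select_units` mirrors the frequency-based registration step.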
BACKGROUND OF THE INVENTION

[0002] In speech synthesis apparatuses that produce synthetic speech on the basis of text data, a speech synthesis method which pastes and modifies synthesis units at desired pitch intervals while copying and/or deleting them in units of pitch waveforms (PSOLA: Pitch Synchronous Overlap and Add), and produces synthetic speech by concatenating these synthesis units, is becoming popular today.

[0003] Synthetic speech produced by exploiting such a technique contains a distortion due to modification of synthesis units (to be referred to as a modification distortion hereinafter) and a distortion due to concatenation of synthesis units (to be referred to as a concatenation distortion hereinafter). These two distortions seriously degrade the quality of synthetic speech. When the number of synthesis units that can be registered in a synthesis unit inventory is limited, it is nearly impossible to select synthesis units which reduce such distortions. In particular, when only one synthesis unit can be registered in a synthesis unit inventory in correspondence with one phonetic environment, it is totally impossible to select synthesis units which reduce the distortions. If such a synthesis unit inventory is used, the quality of synthetic speech inevitably deteriorates due to the modification and concatenation distortions.

SUMMARY OF THE INVENTION

[0004] The present invention has been made in consideration of the aforementioned prior art, and has as its object to provide a speech synthesis apparatus and method that suppress deterioration of synthetic speech quality by selecting synthesis units to be registered in a synthesis unit inventory in consideration of the influences of concatenation and modification distortions.

[0005] The present invention is described with use of synthesis unit and synthesis unit inventory of synthesis units and synthesis unit inventory.
The synthesis unit represents a part for speech synthesis, and the synthesis unit can be called as a synthesis unit.

[0006] In order to attain the objects, a speech synthesis apparatus of the present invention comprises: distortion output means for obtaining a distortion produced upon modifying a synthesis unit on the basis of predetermined prosody information; and unit registration means for selecting a synthesis unit to be registered in a synthesis unit inventory used in speech synthesis on the basis of the distortion output from said distortion output means.

[0007] In order to attain the objects, a speech synthesis method of the present invention comprises: a distortion output step of obtaining a distortion produced upon modifying a synthesis unit on the basis of predetermined prosody information; and a unit registration step of selecting a synthesis unit to be registered in a synthesis unit inventory used in speech synthesis on the basis of the distortion output from the distortion output step.

[0008] Other features and advantages of the present invention will be apparent from the following descriptions taken in conjunction with the accompanying drawings, in which like reference characters designate the same or similar parts throughout the figures thereof.

BRIEF DESCRIPTION OF THE DRAWINGS

[0009] The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate embodiments of the invention and, together with the descriptions, serve to explain the principle of the invention.

[0010] FIG. 1 is a block diagram showing the hardware arrangement of a speech synthesis apparatus according to an embodiment of the present invention;

[0011] FIG. 2 is a block diagram showing the module arrangement of a speech synthesis apparatus according to the first embodiment of the present invention;

[0012] FIG. 3 is a flow chart showing the flow of processing in an on-line module according to the first embodiment;

[0013] FIG. 4 is a block diagram showing the detailed arrangement of an off-line module according to the first embodiment;

[0014] FIG. 5 is a flow chart showing the flow of processing in the off-line module according to the first embodiment;

[0015] FIG. 6 is a view for explaining modification of synthesis units according to the first embodiment of the present invention;

[0016] FIG. 7 is a view for explaining a concatenation distortion of synthesis units according to the first embodiment of the present invention;

[0017] FIG. 8 is a view for explaining the determination process of distortions in synthesis units;

[0018] FIG. 9 is a view for explaining the determination process by Nbest;

[0019] FIG. 10 is a view for explaining a case where synthesis unit units are represented by a mixture of a diphone and a half-diphone, according to the third embodiment of the present invention;

[0020] FIG. 11 is a view for explaining a case where synthesis unit units are represented by half-diphones, according to the fourth embodiment of the present invention;
|
||