
Digital Speech Processing

College: College of Information Technology     Department: Software Department     Stage: 3
Course instructor: إيمان صالح صكبان الرواشدي       3/27/2011 6:23:11 PM

Lecture 11


1. Introduction

 

•     Humans perceive sounds in a frequency range of roughly 20 Hz (hertz, cycles per second) to 20,000 Hz. Above about 20 kHz we hear nothing.

 

•     Speech data must be saved in an audio file format. In recent years the most frequently used format has been the Microsoft WAV format. There are many other formats (for example MP3, and container formats such as MKV and AVI). The formats differ in how many samples are stored per second and in exactly how each sample is represented.
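The two properties named above, samples per second and the representation of each sample, are stored in the header of a WAV file. The following is a minimal sketch using Python's standard `wave` module; the helper names `write_tone` and `describe_wav`, the 8000 Hz rate, and the 16-bit mono layout are assumptions chosen for illustration, not values from the lecture.

```python
import io
import math
import struct
import wave


def write_tone(buffer, sample_rate=8000, n_samples=80, freq=440.0):
    """Write a short mono sine tone as 16-bit PCM WAV data into `buffer`."""
    frames = b"".join(
        # "<h" packs one little-endian signed 16-bit integer per sample.
        struct.pack("<h", int(0.5 * 32767 * math.sin(2 * math.pi * freq * n / sample_rate)))
        for n in range(n_samples)
    )
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)            # mono
        wav.setsampwidth(2)            # 2 bytes = 16 bits per sample
        wav.setframerate(sample_rate)  # samples per second
        wav.writeframes(frames)


def describe_wav(buffer):
    """Read the header fields that define how the samples are stored."""
    with wave.open(buffer, "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "sample_rate": wav.getframerate(),
            "bits_per_sample": 8 * wav.getsampwidth(),
            "n_samples": wav.getnframes(),
        }


if __name__ == "__main__":
    buf = io.BytesIO()  # in-memory stand-in for a file on disk
    write_tone(buf)
    buf.seek(0)
    print(describe_wav(buf))
```

An in-memory buffer is used so the sketch leaves no files behind; passing a file name to `wave.open` works the same way.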

 

 

2. Applications of Digital Speech Processing

 

The first step in most applications of digital speech processing is to convert the acoustic waveform to a sequence of numbers. Most modern A-to-D converters operate by sampling at a very high rate, applying a digital lowpass filter with its cutoff set to preserve a prescribed bandwidth, and then reducing the sampling rate to the desired rate, which can be as low as twice the cutoff frequency of the sharp-cutoff digital filter. This discrete-time representation is the starting point for most applications. From this point, other representations are obtained by digital processing.
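The filter-then-reduce step described above can be sketched in a few lines. This is only an illustration under stated assumptions: the 64,000 Hz input rate, the 4,000 Hz cutoff, the 101-tap Hamming-windowed sinc filter, and all function names are choices made here, and a real converter would use a much sharper filter than this one.

```python
import math


def lowpass_fir(cutoff_hz, sample_rate_hz, num_taps=101):
    """Design a Hamming-windowed sinc lowpass FIR filter with unity gain at DC."""
    fc = cutoff_hz / sample_rate_hz  # normalized cutoff in cycles per sample
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        ideal = 2 * fc if x == 0 else math.sin(2 * math.pi * fc * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]


def filter_and_decimate(samples, taps, factor):
    """Apply the FIR filter, computing only every `factor`-th output sample."""
    out = []
    for i in range(0, len(samples), factor):
        acc = 0.0
        for k, t in enumerate(taps):
            j = i - k
            if 0 <= j < len(samples):
                acc += t * samples[j]
        out.append(acc)
    return out


def tone(freq_hz, sample_rate_hz, n):
    """Generate n samples of a unit-amplitude sine wave."""
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate_hz) for i in range(n)]


def rms(values):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(v * v for v in values) / len(values))


if __name__ == "__main__":
    high_rate, cutoff, factor = 64000, 4000, 8  # output rate = 8000 Hz = 2 * cutoff
    taps = lowpass_fir(cutoff, high_rate)
    for freq in (1000, 20000):
        reduced = filter_and_decimate(tone(freq, high_rate, 6400), taps, factor)
        # Skip the first outputs, where the filter has not yet filled up.
        print(freq, "Hz -> RMS after decimation:", rms(reduced[20:]))
```

The 1,000 Hz tone lies inside the preserved bandwidth and passes almost unchanged, while the 20,000 Hz tone is suppressed before the rate is reduced, so it cannot alias into the 8,000 Hz output.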

 

 

 

 

The field of voice processing encompasses five broad technology areas:

 

•     voice coding, the process of compressing the information in a voice signal so that it can be transmitted or stored economically over a channel whose bandwidth is significantly smaller than that of the uncompressed signal;

 

•     voice synthesis, the process of creating a synthetic replica of a voice signal so as to transmit a message from a machine to a person, with the purpose of conveying the information in the message;

 

•     speech recognition, the process of extracting the message information in a voice signal so as to control the actions of a machine in response to spoken commands;

 

•     speaker recognition, the process of either identifying or verifying a speaker by extracting individual voice characteristics, primarily for the purpose of restricting access to information (e.g., personal/private records), networks, or physical premises;

 

•     spoken language translation, the process of recognizing the speech of a person talking in one language, translating the message content to a second language, and synthesizing an appropriate message in the second language, for the purpose of providing two-way communication between people who do not speak the same language.
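As a small illustration of voice coding, the first area in the list above, the sketch below compresses each sample to 8 bits using mu-law companding. The lecture does not name a specific coder, so mu-law, the constant 255, the 8,000 Hz rate, and the function names are assumptions chosen here because the technique is simple; it stands in for voice coding in general, not for any particular system.

```python
import math

MU = 255  # companding constant; 255 is the value commonly used with 8-bit codes


def mulaw_encode(x):
    """Map a sample in [-1, 1] to an 8-bit code in 0..255."""
    x = max(-1.0, min(1.0, x))
    # Logarithmic compression: small amplitudes get finer resolution.
    y = math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)
    return int(round((y + 1) / 2 * 255))


def mulaw_decode(code):
    """Map an 8-bit code back to an approximate sample in [-1, 1]."""
    y = 2 * code / 255 - 1
    return math.copysign(((1 + MU) ** abs(y) - 1) / MU, y)


def bit_rate(sample_rate_hz, bits_per_sample):
    """Bits per second for one uncompressed channel."""
    return sample_rate_hz * bits_per_sample


if __name__ == "__main__":
    for x in (-1.0, -0.5, 0.01, 0.5, 1.0):
        code = mulaw_encode(x)
        print(x, "->", code, "->", mulaw_decode(code))
    print("16-bit samples at 8000 Hz:", bit_rate(8000, 16), "bit/s")
    print(" 8-bit codes at 8000 Hz:", bit_rate(8000, 8), "bit/s")
```

Halving the bits per sample halves the bit rate, at the cost of a small reconstruction error in each decoded sample.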

 

 

 

