Academic year: 2021

1. INTRODUCTION

Speech is widely regarded as the most powerful form of communication. Beyond the words themselves, a speech signal conveys characteristics of the speaker, such as gender, attitude, emotional state, health condition, and identity. As shown in Figure 1.1, there are three basic kinds of recognition system as far as the speech signal is concerned:

- Speech recognition systems
- Language recognition systems
- Speaker recognition systems

Speech recognition is concerned with recognizing a spoken word or sentence. Language recognition is concerned with recognizing the words and sentences of a language in order to determine which language is being spoken.

Speaker recognition systems are divided into two groups: speaker identification and speaker verification. In speaker identification, the task is to use a speech sample to determine which person, from among a number of known speakers, produced the speech. In speaker verification, the task is to determine whether or not a person who claims to have produced the speech has in fact done so. Speaker verification has many practical applications and is mainly used in remote voice-based password verification systems, where the user enters his or her password by speaking it.
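The difference between choosing one speaker from a set and accepting or rejecting a claimed identity can be illustrated with a minimal Python sketch. It is not taken from the thesis, whose system is written in MATLAB. The function names, the speaker labels, the distance values, and the threshold are all hypothetical; the sketch assumes that some matching stage has already produced one distance per enrolled speaker, with a smaller distance meaning a closer match.

```python
def identify(distances):
    """Pick the enrolled speaker whose model is closest to the test sample."""
    # distances: dict mapping speaker label -> distance (smaller is closer)
    return min(distances, key=distances.get)


def verify(distances, claimed, threshold):
    """Accept the claimed identity only if its distance is at or below the threshold."""
    return distances[claimed] <= threshold


# Hypothetical distances between one test utterance and three enrolled speakers.
distances = {"speaker_a": 4.2, "speaker_b": 1.3, "speaker_c": 2.8}

best_match = identify(distances)                 # one label out of the enrolled set
accepted = verify(distances, "speaker_c", 2.0)   # yes/no decision for a single claim
```

The first task always returns one of the enrolled speakers, whereas the second can reject the sample outright, which is why it needs a threshold.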

Speaker recognition systems can also be divided by method into text-dependent and text-independent systems. In a text-dependent system, the speaker is identified from his or her utterance of a specific phrase, such as a password, PIN code, or credit card number. Here, the system can recognise the speaker only when the expected phrase has been spoken. Such systems are commonly used in security-based real-time applications. In a text-independent system, the speaker is identified irrespective of what he or she is saying. In general, it is more difficult and less reliable to recognise a speaker in such a system.

Figure 1.1 Extracting information from speech [20]

The goal of this thesis is the development of a MATLAB-based speaker recognition system. Mel Frequency Cepstral Coefficients (MFCC) are used for speech feature extraction, and a vector quantization algorithm is used for speech feature matching. The developed system has a menu-type Graphical User Interface (GUI), where a user can load new speech signals into the database, select and play a speech signal, display the time-domain graph of each speech signal, display the power spectra of the signals, or recognize a speaker from his or her speech signal by finding a match in the database. The thesis covers the basics of speech production, perception, and digital signal analysis techniques. These are the fundamental building blocks required to understand the various methods and procedures used in the thesis.

Chapter 2 briefly describes the concepts of speaker recognition systems. A literature survey in this chapter reviews previous research work in the field.

Chapter 3 covers speech production and describes the characteristics of voiced and unvoiced speech. In addition, the technical characteristics of speech signals are outlined briefly.

Chapter 4 describes speaker identification systems and the methods used to identify speakers from speech signals. Both text-dependent and text-independent identification methods are described in this chapter.

Chapter 5 covers speech feature extraction and vector quantization techniques. It describes how the fundamental features of a speech signal can be extracted using various techniques. In particular, the Mel Frequency Cepstral Coefficient (MFCC) technique used in this thesis is described in detail.
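As a brief illustration of the mel scale that underlies this feature extraction technique, the Python sketch below converts between hertz and mels and spaces filter centre frequencies evenly on the mel scale. It is not taken from the thesis, whose system is written in MATLAB. The conversion formula (2595 log10(1 + f/700)) is a commonly used one and is an assumption here, since this chapter does not state which variant the thesis uses; the function names and the choice of 20 filters between 0 and 4000 Hz are likewise hypothetical.

```python
import math


def hz_to_mel(f_hz):
    """Convert a frequency in Hz to mels (assumed formula, see lead-in)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filter_centers(n_filters, f_low_hz, f_high_hz):
    """Return n_filters centre frequencies (Hz) equally spaced on the mel scale.

    The two band edges f_low_hz and f_high_hz are not included in the result.
    """
    lo, hi = hz_to_mel(f_low_hz), hz_to_mel(f_high_hz)
    step = (hi - lo) / (n_filters + 1)
    return [mel_to_hz(lo + i * step) for i in range(1, n_filters + 1)]


# Hypothetical configuration: 20 filters covering 0 to 4000 Hz.
centers = mel_filter_centers(20, 0.0, 4000.0)
```

Because the spacing is uniform in mels, the centre frequencies are close together at low frequencies and spread out at high frequencies, mirroring the frequency resolution of human hearing.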

Feature matching is one of the fundamental processes in speaker identification. After the features of a speech signal have been extracted, a feature matching method is employed to find the speaker among a number of known speakers. Chapter 6 describes the feature matching technique used in this thesis.
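The idea of matching extracted features against stored speaker models can be sketched in Python with a vector quantization codebook per speaker. This is only an illustration under stated assumptions, not the thesis implementation, which is written in MATLAB. It assumes that each speaker's codebook has already been trained, that Euclidean distance is used, and that the score is the average distance from each feature vector to its nearest codeword. The function names, the two-dimensional toy vectors, and the speaker labels are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_distortion(features, codebook):
    """Mean distance from each feature vector to its nearest codeword."""
    total = sum(min(euclidean(f, c) for c in codebook) for f in features)
    return total / len(features)


def match_speaker(features, codebooks):
    """Return the speaker whose codebook gives the lowest average distortion."""
    scores = {name: average_distortion(features, cb)
              for name, cb in codebooks.items()}
    best = min(scores, key=scores.get)
    return best, scores


# Toy data: real feature vectors would be MFCC frames with more dimensions.
codebooks = {
    "speaker_a": [[0.0, 0.0], [1.0, 1.0]],
    "speaker_b": [[5.0, 5.0], [6.0, 6.0]],
}
test_features = [[0.1, 0.0], [0.9, 1.1]]

best, scores = match_speaker(test_features, codebooks)
```

Averaging over all frames means that no single frame decides the outcome, which makes the score less sensitive to a few noisy feature vectors.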

Chapter 7 describes in detail the MATLAB-based speaker recognition system developed by the author.

Finally, the conclusions and suggestions for future work are given in Chapter 8.

The thesis concludes with the References and an Appendix that lists the MATLAB program developed by the author.
