Two-Level Description of Turkish Morphology



Kemal Oflazer

Department of Computer Engineering and Information Science, Bilkent University, Ankara, Turkey

Fax: (90-4) 266 4127, e-mail: ko@trbilun.bitnet

1 Introduction

This poster paper describes a full-scale two-level morphological description (Karttunen, 1983; Koskenniemi, 1983) of Turkish word structures. The description has been implemented using the PC-KIMMO environment (Antworth, 1990) and is based on a root word lexicon of about 23,000 root words. Almost all the special cases of and exceptions to phonological and morphological rules have been implemented.

Turkish is an agglutinative language whose word structures are formed by productive affixation of derivational and inflectional suffixes to root words. Turkish has finite-state but nevertheless rather complex morphotactics. Morphemes added to a root word or a stem can convert the word from a nominal to a verbal structure or vice versa, or can create adverbial constructs. The surface realizations of morphological constructions are constrained and modified by a number of phonetic rules such as vowel harmony.
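As an illustration of how vowel harmony shapes surface forms, the following minimal Python sketch resolves two harmonizing meta-phonemes in a lexical form. It assumes that `A` stands for a low unrounded vowel (a/e) and `H` for a high vowel (ı/i/u/ü), in line with the notation in the sample output; the function name `realize` and the test words are hypothetical. It covers harmony only and is not a rendering of the 22 two-level rules, which also handle consonant changes, deletions and exceptions.

```python
BACK_VOWELS = set("aıou")
FRONT_VOWELS = set("eiöü")
ROUNDED_VOWELS = set("oöuü")
VOWELS = BACK_VOWELS | FRONT_VOWELS

# Surface vowel chosen for H, keyed by (back?, rounded?) of the preceding vowel.
HIGH_VOWEL = {
    (True, False): "ı",
    (True, True): "u",
    (False, False): "i",
    (False, True): "ü",
}


def realize(lexical: str) -> str:
    """Resolve A and H left to right by vowel harmony (harmony only)."""
    surface = []
    last_vowel = None
    for ch in lexical:
        if ch == "+":  # morpheme boundary, not realized on the surface
            continue
        if ch in ("A", "H"):
            if last_vowel is None:
                raise ValueError("meta-phoneme has no preceding vowel")
            back = last_vowel in BACK_VOWELS
            if ch == "A":
                ch = "a" if back else "e"
            else:
                ch = HIGH_VOWEL[(back, last_vowel in ROUNDED_VOWELS)]
        if ch in VOWELS:
            last_vowel = ch  # the resolved vowel governs the next suffix
        surface.append(ch)
    return "".join(surface)


if __name__ == "__main__":
    for form in ("ev+lAr", "kitap+lAr", "göz+Hm", "okul+lAr+Hm"):
        print(form, "->", realize(form))
```

Each resolved vowel becomes the context for the next one, which is why a chain of suffixes harmonizes step by step rather than all at once with the root.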

2 Two-level description of Turkish morphology

The phonetic rules of contemporary Turkish have been encoded using 22 two-level rules, while the morphotactics of the agglutinative word structures has been encoded as finite-state machines for verbal and nominal paradigms. Our lexicons are based on the comprehensive word list that we compiled for our earlier spelling checker (Solak and Oflazer, 1992). We have lexicons for nouns, adjectives, verbs, compound nouns, proper nouns, pronouns, adverbs, connectives, exclamations, postpositions, acronyms, technical words, and special cases. There are a total of 18,500 nominal (noun and adjective) roots and about 2,450 verbal roots. There are about 100 lexicons for suffixes.
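To make the idea of finite-state morphotactics concrete, here is a toy Python sketch of a nominal paradigm as a state machine over suffix tags. The states, the transition table and the `PLU` tag are assumptions made for illustration; the other tags are borrowed from the glosses in the sample output. The actual machines are far larger and are expressed as PC-KIMMO lexicons, not as Python.

```python
# Toy nominal morphotactics: root, then optional plural, possessive, case.
# States and transitions are illustrative assumptions, not the paper's machine.
TRANSITIONS = {
    "ROOT": {"PLU": "PLURAL", "2PS-POS": "POSS", "3PS-POS": "POSS",
             "ACC": "CASE", "GEN": "CASE"},
    "PLURAL": {"2PS-POS": "POSS", "3PS-POS": "POSS",
               "ACC": "CASE", "GEN": "CASE"},
    "POSS": {"ACC": "CASE", "GEN": "CASE"},
    "CASE": {},
}
FINAL_STATES = {"ROOT", "PLURAL", "POSS", "CASE"}


def accepts(tags):
    """Return True if the suffix tag sequence is a valid path from ROOT."""
    state = "ROOT"
    for tag in tags:
        next_state = TRANSITIONS[state].get(tag)
        if next_state is None:  # no arc for this tag: ordering is illegal
            return False
        state = next_state
    return state in FINAL_STATES


if __name__ == "__main__":
    print(accepts(["2PS-POS", "GEN"]))  # possessive before case
    print(accepts(["GEN", "2PS-POS"]))  # case before possessive
```

Because suffix order is encoded in the arcs, the machine accepts possessive followed by case and rejects the reverse order without any separate ordering rules.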

3 Example Output

Here we provide a sample output from our implementation (slightly edited for proper orthography):

Input        Morpheme struct.   Gloss                            English meaning
çalışmanın   çalış+mA+Hn+nHn    V(çalış)+VtoN(ma)+2PS-POS+GEN    of your work(ing)
             çalış+mA+nHn       V(çalış)+VtoN(ma)+GEN            of the work(ing)
çocuğu       çocuk+sH           N(çocuk)+3PS-POS                 his/her child
             çocuk+yH           N(çocuk)+ACC                     child (accusative)
alınmış      al+Hn+ymHş         N(al)+2PS-POS+NtoV()+NARR+3PS    (it) was your red (one)
             al+nHn+ymHş        N(al)+GEN+NtoV()+NARR+3PS        (it) belongs to the red (one)
             al$ın+ymHş         N(alın)+NtoV()+NARR+3PS          (it) was a forehead
             al+Hn+mHş          V(al)+PASS+VtoAdj(mis)           (a) taken (object)
             al+Hn+mHş          V(al)+PASS+NARR+3PS              it was taken
             alın+mHş           V(alın)+VtoAdj(mis)              (an) offended (person)
             alın+mHş           V(alın)+NARR+3PS                 s/he was offended
boynu        boy$un+sH          N(boyun)+3PS-POS                 (his/her) neck
             boy$un+yH          N(boyun)+ACC                     neck (accusative)
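The gloss strings in the sample output follow a regular shape: a category with the root in parentheses, followed by feature tags separated by `+`. A downstream application could decompose them along the following lines; this is a minimal sketch, and the `Analysis` type and `parse_gloss` function are hypothetical names, not part of the implementation described here.

```python
import re
from typing import List, NamedTuple


class Analysis(NamedTuple):
    category: str        # root category, e.g. "N" or "V"
    root: str            # root word inside the parentheses
    features: List[str]  # remaining tags in surface order


# Head of a gloss: CATEGORY(root); \w also matches Turkish letters in Python 3.
_HEAD = re.compile(r"^(\w+)\((\w*)\)$")


def parse_gloss(gloss: str) -> Analysis:
    """Split a gloss such as 'V(al)+PASS+NARR+3PS' into its parts."""
    head, *features = gloss.split("+")
    match = _HEAD.match(head)
    if match is None:
        raise ValueError(f"unexpected gloss head: {head!r}")
    return Analysis(match.group(1), match.group(2), features)


if __name__ == "__main__":
    print(parse_gloss("V(al)+PASS+NARR+3PS"))
    print(parse_gloss("N(çocuk)+3PS-POS"))
```

Splitting on `+` leaves hyphenated tags such as `3PS-POS` intact, so each feature stays a single unit.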

4 Conclusions

This poster has presented a summary of the first full-scale implementation of a two-level description of Turkish morphology. We have been using this description as a morphological parsing module in a number of applications such as LFG parsing, ATN parsing and semantic analysis of Turkish sentences.

References

[Antworth, 1990] Evan L. Antworth. PC-KIMMO: A two-level processor for Morphological Analysis. Summer Institute of Linguistics, Dallas, Texas, 1990.

[Karttunen, 1983] Lauri Karttunen. KIMMO: A general morphological processor. Texas Linguistic Forum, 22:163-186, 1983.

[Koskenniemi, 1983] Kimmo Koskenniemi. Two-level morphology: A general computational model for word form recognition and production. Publication No: 11, Department of General Linguistics, University of Helsinki, 1983.

[Solak and Oflazer, 1992] Ayşın Solak and Kemal Oflazer. Parsing agglutinative word structures and its application to spelling checking for Turkish. In Proceedings of the 15th International Conference on Computational Linguistics, volume 1, pages 39-45, Nantes, France, 1992. International Committee on Computational Linguistics.
