Effect of Self-Evaluation-Based Oral Reading Method in Elementary School on Reading Fluency and Reading Comprehension


Vol.9(2), pp. 437-462, March 2022

Available online at http://www.perjournal.com ISSN: 2148-6123

http://dx.doi.org/10.17275/per.22.48.9.2


Ferhat Saat

Elementary Teacher, Nevşehir, Turkey ORCID: 0000-0002-0978-857

Emine Gül Özenç*

Education Faculty, Niğde Ömer Halisdemir University, Niğde, Turkey ORCID: 0000-0003-3161-4251

Article history

Received: 04.03.2021

Received in revised form: 14.09.2021

Accepted: 21.10.2021

This study aimed to determine the effect of an oral reading method based on self-evaluation on the fluent reading and reading comprehension of fourth-grade elementary students. The research used a nested mixed design, one of the mixed method designs. In the quantitative dimension, the nonequivalent (unmatched) pretest-posttest control group design, one of the quasi-experimental models, was used. In the qualitative dimension, interviews and document analysis were conducted. The study group consisted of fourth-grade students at a primary school in the province of Nevşehir in Turkey in the 2018-2019 academic year. To avoid loss of time and cost in determining the study group, non-random convenience sampling was used. The practices based on the self-evaluation reading method were carried out in the researcher's own class, and the other fourth-grade class in the school served as the control group (n = 10 in each group). Statistical techniques were used to compare the pretest and posttest results of each group. In addition, content analysis was used to determine the experimental group's views on the practices. The results showed that the reading method based on self-evaluation is effective in improving the students' correct reading, reading, and reading comprehension skills. No significant difference by gender was found in the development of these skills, except for reading speed.

Key words: Fluent reading; Oral reading; Reading comprehension; Self-evaluation

Introduction

Although the concept of reading was initially expressed as learning to read and write (literacy), interpretation later came to be seen as a dimension of reading as well. Including this dimension, Akyol (2013, p. 33) defined reading as a dynamic interpretation process occurring through an interaction between the writer and the reader, whereas Grabe and Stoller (2013, p. 3) defined it as the ability to infer meaning from a written page and interpret it.

*Correspondence: egmortas@hotmail.com

Reading was also defined as combining visual and non-visual information in the mind and making a connection between one thought and another (Johnson, 2017: p.4). Sidekli and Yangın (2005, p. 394) on the other hand explained reading as a cognitive activity based on the perception, meaning, comprehension, and interpretation of words through the senses. These definitions indicate that the concept of reading includes not only the ability to vocalize words but also interpretation.

Oral Reading

Öz (2011, p. 212) defined oral reading as reading a text in a voice that the listener can hear and listen to with pleasure. Vuillemin (2007) regarded oral reading as a priority and as a complex activity that requires deep learning and should not be postponed until after basic learning. In fact, oral reading is a way of sharing pleasure, recognition, and knowledge (Jean, 1999, p. 52).

Kavcar, Oğuzkan, and Sever (1995, p. 43) defined oral reading as vocalizing words or clusters of words perceived by the eye with the help of the speech organs. Göçer (2007, p.36) added the phrase “perception of meaning” to these definitions and defined oral reading as seeing writing, vocalizing words, and perceiving meaning. As can be understood from this definition, comprehension, one of the basic qualities of reading, is also a valid rule here. In addition, components such as stress, intonation, and orthographic reading, which are the elements of oral reading, are also important in interpretation and are effective in the enjoyment of the listeners.

A good oral reading education has many benefits for students. With oral reading, students improve their voice, stress, and intonation skills, and their word recognition and comprehension skills (Güneş, 2011, p. 9). They see their reading errors better. While reading words, they realize their mistakes such as inversions, skips, repetitions, and additions (Akyol, 2013, p. 243-244).

These mistakes can be corrected with the help of oral reading or by different methods. Such mistakes made in reading make it difficult to interpret, which is the ultimate aim of reading. In order to achieve the ultimate goal, poor readers need to overcome such problems and reach a fluent reading level. For these reasons, oral reading is mostly used in the first years of elementary education (Yılmaz, 2014, p. 87-88).

Reading Fluency

The basis of fluency in reading is explained by three components. The first of these is word recognition and word differentiation (Dündar & Akyol, 2014, p. 364). This forms the accuracy component of reading fluency. Correct reading refers to accurately recognizing and decoding words, understanding the alphabetic rules, and accurately voicing letters (Ehri & Sandra, 1998, p. 135). Fast reading, the second component of reading fluency, is the ability to send more messages to the brain in a short time (Beydoğan, 2012, p. 4). The third component of reading fluency is prosody. Dowhower (1991, p. 3) defined prosody as reading with effective rhythm and melody patterns. In another definition, prosody was defined as reading the text by paying attention to intonation, emphasis, and pauses, and by obeying punctuation and spelling rules (Kodan & Akyol, 2016, p. 8). Zutell and Rasinski (1991, p. 211) defined reading fluency as reading that reflects the writer’s feelings automatically, without much effort spent on recognizing words, while taking into account the units of meaning in the sentence and applying intonation and emphasis where necessary. Conveying the writer’s emotion requires interpretation. If the reader understands what he or she is reading, he or she will use intonation and may reflect the writer’s emotion in his or her reading. For this reason, it is important for the individual who reads fluently to read correctly and quickly in order to interpret the text (Baydık, Ergül, & Bahap Kudret, 2012, p. 779). Meyer and Felton (1999) consider fluency as the ability to read a text quickly, smoothly, effortlessly, and automatically, with little attention to reading and decoding. Akyol (2019, p. 4) defined reading fluency as reading that pays attention to punctuation marks, emphasis, and intonation, that does not include turning back and repeating words, that avoids syllabification and unnecessary pausing, and that is done in meaning units as if speaking. Kuhn, Schwanenflugel, and Meisinger (2010) referred to reading fluency as a structure consisting of components such as the accurate reading of words, automaticity in reading, and prosody, which together enable the reader to establish meaning. It is argued that a skilled reader has mastery beyond the simple level of accuracy. That is, a reader can only be classified as fluent when his or her correct reading is fast enough. Among the various components activated in the reading process, reading speed has become an important feature of the concept of reading fluency (Doehring, 1976, cited in Breznitz, 2006, p. 2). The sub-skill of pronouncing words aloud is used in meaningful reading fluency. Furthermore, the knowledge of syllabification of sounds is important in reading fluency (LaBerge & Samuels, 1977). In addition to these, oral reading fluency is a necessary feature to define good reading (Allington, 1983).

As the developmental result of reading skills, fluency occurs when decoding skills are attained and word recognition becomes automatic. Although reading fluency is addressed on its own, it is considered a typical and critical area of intervention because of its connection with reading comprehension. Consistent with the perspective of the influential Report of the National Reading Panel of the National Institute of Child Health and Human Development (NICHD), fluency is seen as a bridge between word recognition and reading comprehension (Pikulski & Chard, 2005, cited in Rasinski, 2012). Fuchs, Fuchs, Hosp, and Jenkins (2001) found a correlation of .91 between middle school students’ reading fluency and reading comprehension, while Quirk and Beem (2012) found a significant relationship between 2nd, 3rd, and 5th grade students’ reading fluency and comprehension. Yıldız (2013) reached a similar conclusion.

Self-Evaluation

Self-evaluation refers to making judgments about one’s own learning (Boud & Falchikov, 1989, p. 529). Klenowski (1995, p. 145) defined self-evaluation as an individual’s evaluation and judgment about his or her own achievement in any subject and determination of his or her strengths and weaknesses in order to increase learning outcomes. Self-evaluation has also been described as students’ determining the standards and/or criteria to be applied to their work and judging the extent to which they meet these criteria and standards (Boud, 1986, cited in McDonald & Boud, 2003). In addition, there are three main factors in the success of self-evaluation: Self-evaluation = Model + Measurement + Management (EFQM, cited in Hillman, 1994, p. 29).

With self-evaluation, students are expected to identify their own reading mistakes, evaluate their own speed, criticize their reading prosodies, and have a positive view on reading. The study was based on students’ finding their own mistakes, rather than a listener because an adult showing the child his or her mistakes may damage the child’s self-efficacy (Ulu & Başaran, 2013, p.3).

Study Purpose

In order for students to be good readers, it is important to solve reading problems at an early age, because doing so significantly affects the child’s success both in education and in social life. Activities such as reading contests, silent reading, and shared reading, which are common in the first years of elementary school, are examples of efforts to overcome these problems. However, it is believed that reading contests based on a competitive understanding and shared reading among students who already read well have some shortcomings. Another point is that the student’s reading mistakes are usually corrected by a friend or a teacher, which makes a guide necessary in these practices.

Another important point is that we are in a period where children spend more time with technological devices, so using technology as a tool in children’s reading is important. For all these reasons, self-evaluation-based oral reading practices are believed to be important: the student who is not at the desired reading level will not feel anxious about being criticized; the student will take responsibility for learning; technology will be used effectively in education; the student will make progress at his or her own pace; the student will not need a guide while reading; the student will have the opportunity to repeat and correct his or her reading as often as he or she wants; the practices provide convenience to mothers and fathers, especially in the first years of elementary school; and students may become interested in reading through different methods instead of the same ones.

The literature review revealed that there are numerous studies on reading and comprehension (Çankal, 2018; Cüre, 2018; Montgomerie & Little, 2014; Strickland, Boon, & Spencer, 2013; Roundy & Roundy, 2009; Therrien & Hughes, 2008; Morra & Tracey, 2006; Therrien, 2004; Bryant, Vaughn, Linan-Thompson, Ugel, Hamff, & Hougen, 2000; Samuels, 1997; Rasinski, 1990). However, there were no studies conducted on reading fluency based on self-evaluation.

The general aim of the research was to examine the effect of the oral reading method based on self-evaluation in elementary schools on reading fluency and reading comprehension. In line with this aim, the following study questions were developed:

(1) Does the oral reading method based on self-evaluation have an effect on correct, fast, and prosodic reading (reading fluency) in elementary school?

(2) Does the oral reading method based on self-evaluation have an effect on reading comprehension in elementary school?

(3) How does the effectiveness of the oral reading method based on self-evaluation in elementary school change in time?

(4) What are students’ views on the oral reading method based on self-evaluation in elementary school?

Method

The study employed the mixed method, a research method including both qualitative and quantitative data. The mixed method is formed of the collection and analysis of quantitative and qualitative data in studies conducted within the scope of the research. In the mixed method, instead of using only qualitative or quantitative data, the researcher uses both methods to contribute to the full understanding of the problem (Creswell, 2004/2017, p.19). In this way, the researcher not only performs his or her research in more detail but also finds answers to the problems of the research by using all data resources since he or she does not limit himself or herself. In the study, nested mixed design, one of the mixed method designs, was used. This pattern involves the concurrent or sequential use of data. Here, qualitative or quantitative data are more dominant in the study. The non-dominant data type is used to support the dominant data (Creswell, 2004/2017, p.16). In the present study, the first three sub-problems of the study were problems requiring quantitative data, while the last sub-problem required qualitative data.

For this reason, the mixed method was used in this study. In addition, the nested mixed design was chosen because three of the four sub-problems required quantitative data and only one required qualitative data, which makes the study predominantly empirical. The qualitative data were used to determine the students’ views on the practices.

In the quantitative dimension of the study, the nonequivalent (unmatched) pretest-posttest control group design, one of the quasi-experimental designs, was used. In this design, which group will be the experimental group and which the control group is decided randomly, and attention is paid to the groups being as similar as possible (Karasar, 2017, p. 137). Since this study was carried out with the two existing 4th grade classrooms in the school, with each student attending his or her own classroom, it was not possible to equalize the classrooms or to adjust the groups according to any criteria. For this reason, the use of this design was deemed appropriate, and the practices were planned accordingly.

Study Design

Aiming to reveal the relationship between 4th grade students’ functional literacy levels and their problem-solving skills, this research employed the relational survey design, one of the general survey designs. Survey design aims to meticulously and thoroughly describe a situation or a phenomenon that participants have previously experienced or are still experiencing (Karasar, 2014). The survey design describes the current situation of a subject. In survey design, data on situations with more than one feature can be collected and the relationship level between them can be questioned. In this method, in order to search the relation between more than two variables, these variables are examined without interfering with them in any way (Büyüköztürk, Kılıç-Çakmak, Akgün, Karadeniz, & Demirel, 2014). This research aimed to find out the relation between 4th grade students’ functional literacy levels and their perceptions of problem-solving skills. Accordingly, scales were administered to determine both variables, and the relationship level of these variables was determined by performing the necessary statistical analyses.

Study Group

The study group consisted of students enrolled in classrooms 4A and 4B at an elementary school in the Derinkuyu district of the province of Nevşehir in Turkey during the second semester of the 2018-2019 academic year. The classrooms were assigned randomly as the experimental and the control group. The groups were not equalized because of the study design. However, the groups were similar in that the students came from the same village and their socio-economic levels were close to each other. There were 12 students in the experimental group and 13 in the control group. Two students in the experimental group were not included in the study because one of them had a mild mental disability and the other had a stuttering problem. In the end, the experimental group had 10 students (six girls, four boys). Three of the students in the control group were not included in the study because one of them was always absent and the other two had a mental disability. In the end, the control group had 10 students (five girls, five boys).

Data Collection Tools

Qualitative and quantitative data collection tools were used in the study.


Figure 1. Data Collection Tools

An inventory, a scale, and tests were used in the quantitative dimension of the study, namely the Informal Reading Inventory, the Prosodic Reading Scale, and the Reading Comprehension Test. Interviews and written student views were used in the qualitative dimension: a Semi-Structured Interview Form and voice recordings were used in the interviews, together with Written Student Views.

Quantitative Data Collection Tools

This section discusses the error analysis inventory, prosodic reading scale, and reading comprehension test used in quantitative data collection.

Informal reading inventory

The Informal Reading Inventory, adapted to Turkish by Akyol (cited in Akyol, 2013, p. 101) from Harris and Sipay (1990), Ekwall and Shanker (1988), and May (1986), was used to measure reading speed and correct reading levels, which are dimensions of reading fluency, and to identify reading mistakes. According to this inventory, skips, additions, repetitions, wrong readings, reversals, and words supplied by the teacher after five seconds were counted as reading mistakes. The inventory's table gives the percentage of word recognition at the intersection of word count and number of reading mistakes. In addition, Akyol (2013, p. 102) classified students at three levels based on the reading levels calculation table he adapted from Ekwall and Shanker (1988, p. 414). These levels are the Independent Reading Level, the Instructional Level, and the Frustration Level for word recognition and comprehension.

According to the inventory, the reading speed of the students and the correct reading percentage were calculated based on the following formulas (Keskin, 2012, p.50). The readings of each student were recorded, and the reading speed and percentage of correct reading were calculated separately for each student.

\[ \text{Reading speed} = \frac{\text{Number of correctly read words} \times 60}{\text{Reading time of the whole text (sec.)}} \]

\[ \text{Accuracy percentage} = \frac{\text{Number of correctly read words} \times 100}{\text{Total number of words in the text}} \]
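The two formulas above can be illustrated with a minimal Python sketch. The function names and the sample values (114 of 120 words read correctly in 90 seconds) are hypothetical and serve only to show the arithmetic; they are not data from the study.

```python
def reading_speed(correct_words: int, reading_time_sec: float) -> float:
    """Correctly read words per minute: correct words x 60 / reading time in seconds."""
    if reading_time_sec <= 0:
        raise ValueError("reading time must be positive")
    return correct_words * 60 / reading_time_sec


def accuracy_percentage(correct_words: int, total_words: int) -> float:
    """Share of the words in the text that were read correctly, as a percentage."""
    if total_words <= 0:
        raise ValueError("the text must contain at least one word")
    return correct_words * 100 / total_words


# Hypothetical student: 114 of 120 words read correctly in 90 seconds.
speed = reading_speed(114, 90)             # 76.0 correct words per minute
accuracy = accuracy_percentage(114, 120)   # 95.0 percent
print(speed, accuracy)
```

Computing both values from the same recording, as described above, keeps the speed measure tied to correctly read words rather than to all words attempted.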


Prosodic reading scale

The Prosodic Reading Scale developed by Keskin, Baştuğ, and Akyol (2013) was used to measure prosodic reading skills, which is the third dimension of reading fluency. The scale is a Likert-type scale consisting of 15 items developed to measure only prosody. The scale aims to measure the dimensions of prosodic reading such as intonation, emphasis, reading with meaning units, reading by reflecting the emotion in the text, reading rhythm, and voice features.

For construct validity, the scale’s KMO value is .97 and Bartlett’s test is significant (p = .00; p < .01). Cronbach’s alpha value is 0.981 (Keskin, Baştuğ, & Akyol, 2013, p. 171). The Cronbach’s alpha value of the scale in the present study was 0.846. The feature to be measured by each item in the scale was rated as always observed (4), mostly observed (3), occasionally observed (2), rarely observed (1), or never observed (0).

In order to ensure the scoring consistency of the scale, a Turkish teacher was asked to watch the reading videos after the necessary explanations were provided and to score them based on the prosody scale. The correlation between the researcher’s scores and the Turkish teacher’s scores was examined. Accordingly, the Spearman’s rho correlation coefficient between the pretest measurements was calculated as r = 0.971, the Spearman’s rho correlation coefficient between the mid evaluation measurements as r = 0.951, and the Spearman’s rho correlation coefficient between the posttest measurements as r = 0.722.
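As a rough illustration of the scoring-consistency check described above, the following sketch computes Spearman's rho as the Pearson correlation of average ranks. The implementation details and the two score lists are assumptions made for illustration; the study does not report which software was used, and the scores below are invented.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks of x and y.

    Assumes both lists have the same length and are not constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical prosody totals given to the same five readings by two raters.
researcher_scores = [42, 35, 50, 28, 44]
teacher_scores = [40, 36, 52, 30, 41]
rho = spearman_rho(researcher_scores, teacher_scores)
print(rho)
```

A rank-based coefficient suits this check because the two raters only need to order the readings in the same way, not to give identical totals.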

Reading comprehension achievement test

Two tests were developed by the researcher to determine the reading comprehension levels. One of the tests was used as pretest and posttest, while the other was used as mid-evaluation. The pretest and posttest was developed based on the narrative text named “İp Bacaklı Çocuk (The Boy with Rope Legs)” in the Turkish textbook taught in the 2017-2018 academic year, and the mid-evaluation was developed based on the narrative text “Ölümsüzleşen Bahçe (Immortalized Garden)” from the same book. The steps determined by Turgut and Baykul (2014, pp. 215-216) were followed while developing both tests. These steps and the work done are:

Test purpose. These tests were developed to determine the students’ reading comprehension levels in narrative texts.

Determination of the properties to be measured in the test. According to the 4th Grade Turkish Teacher’s Edition Book used in the 2017-2018 academic year, there are 41 learning objectives about reading comprehension. This study focused only on narrative texts; other genres (poetry, informative text) were not included. The content validity ratio developed by Lawshe (1975, p. 567) was therefore used to determine which of these 41 learning objectives concerned narrative texts and which could be measured with multiple-choice questions. First, each learning objective was rated as “the learning objective measures the targeted structure”, “the learning objective is related to the structure but unnecessary”, or “the learning objective does not measure the related structure”. The forms developed according to this rating were sent to a panel of ten experts consisting of three faculty members, three Turkish teachers, and four classroom teachers, who were asked to give their views. Based on these views, the content validity ratios (CVR) of the learning objectives were determined.

The content validity ratio is obtained by dividing the number of experts who rated an item “essential” by half of the total number of experts who expressed an opinion on the item, and then subtracting 1 (Yurdugül, 2005, p. 2).


To make content validity ratios easier to evaluate, the minimum CVR values (content validity criteria) at the α = 0.05 significance level have been tabulated (Veneziano & Hooper, 1997, cited in Yurdugül, 2005, p. 2). These minimum values, which depend on the number of experts, determine whether an item is statistically significant. For ratings made by ten experts, the content validity ratio must be greater than 0.62 at the α = 0.05 significance level. Nineteen learning objectives exceeded this value, and therefore these 19 learning objectives were included in the study.
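A minimal sketch of Lawshe's content validity ratio, CVR = n_e / (N / 2) - 1, applied with the 0.62 threshold quoted above for ten experts. The objective names and the "essential" counts are hypothetical and are used only to show how the threshold separates retained from rejected objectives.

```python
def content_validity_ratio(n_essential: int, n_experts: int) -> float:
    """Lawshe's CVR: (n_e - N/2) / (N/2), which equals n_e / (N/2) - 1."""
    if n_experts <= 0:
        raise ValueError("at least one expert is required")
    half = n_experts / 2
    return (n_essential - half) / half


# Threshold quoted in the text for ten experts at alpha = 0.05.
CRITICAL_CVR_TEN_EXPERTS = 0.62

# Hypothetical tallies: how many of the ten experts rated each objective "essential".
essential_counts = {"objective_A": 10, "objective_B": 9, "objective_C": 8}

retained = [
    name
    for name, n_essential in essential_counts.items()
    if content_validity_ratio(n_essential, 10) > CRITICAL_CVR_TEN_EXPERTS
]
print(retained)  # objective_C yields CVR = 0.6 and falls below the threshold
```

With ten experts, an objective therefore needs at least nine "essential" ratings (CVR = 0.8) to pass the 0.62 criterion.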

Writing the items. A total of 48 questions, with at least two questions for each item, were prepared for the 18 learning objectives stated above. The same was done for mid-evaluation. A question pool was created for the pretest with 48 questions in total and for the mid-evaluation with 46 questions.

Item review and development of the trial form. The questions in the pool were presented to one faculty member, two Turkish teachers, and three classroom teachers, and the necessary corrections were made on the questions based on the feedback received.

Trial implementation. The 48 questions in the finalized pre-post test were administered to 92 students, and the 47 questions in the mid-evaluation were administered to 85 students. The implementation was carried out with 4th grade students attending school in the Derinkuyu district of the province of Nevşehir in Turkey.

Scoring the implementation results, item analysis, and item selection. Item discrimination takes values between -1 and +1. A negative value indicates that the item works in reverse; therefore, items with negative values should be removed from the test (Büyüköztürk, Kılıç Çakmak, Akgün, Karadeniz, & Demirel, 2013, p. 123). The item difficulty index is the proportion of students who answered the item correctly (Akbulut & Çepni, 2013, p. 27). When choosing an item for a test, the difficulty should be at a medium level (0.50), the discrimination should be between 0.20 and 0.80, and the discrimination should be as high as possible (0.30 and above) (Özçelik, 2016, p. 217). In line with these explanations, after the implementation, each student’s answer was recorded as 1 if it was correct, or 0 if it was wrong or left blank. Item difficulty and discrimination indices were then calculated for both tests. For the pre-post test reading comprehension questions, item difficulty ranged between 0.28 and 0.88, and item discrimination ranged between -0.36 and 0.63. The items with an insufficient level of difficulty and discrimination were removed from the test. Items with sufficient levels were included in the actual test.

According to the mid-evaluation statistics, item difficulty ranged from 0.24 to 0.83 and item discrimination from -0.10 to 0.64. The items with an insufficient level of difficulty and discrimination were removed from the test. Items with sufficient levels were included in the actual test.
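The item analysis described above can be sketched as follows. The text does not state how discrimination was computed, so this example assumes the common upper-lower group index (top and bottom 27% of students by total score); the function name, the 27% fraction, and the response matrix are all illustrative assumptions rather than the study's procedure or data.

```python
def item_statistics(responses, group_fraction=0.27):
    """Return a (difficulty, discrimination) pair for each item.

    responses: one list of 0/1 item scores per student.
    Difficulty is the proportion of all students answering correctly.
    Discrimination is (correct in upper group - correct in lower group) / group size,
    so it always lies between -1 and +1.
    """
    n_students = len(responses)
    n_items = len(responses[0])
    ranked = sorted(responses, key=sum, reverse=True)  # highest total score first
    group_size = max(1, round(n_students * group_fraction))
    upper, lower = ranked[:group_size], ranked[-group_size:]
    stats = []
    for i in range(n_items):
        difficulty = sum(r[i] for r in responses) / n_students
        discrimination = (sum(r[i] for r in upper) - sum(r[i] for r in lower)) / group_size
        stats.append((difficulty, discrimination))
    return stats


# Hypothetical answers of eight students to four items (1 = correct, 0 = wrong or blank).
answers = [
    [1, 1, 1, 0],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 1],
    [0, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
]
stats = item_statistics(answers)
# Keep items at or above the 0.30 discrimination level mentioned in the text.
kept_items = [i for i, (_, disc) in enumerate(stats) if disc >= 0.30]
print(stats, kept_items)
```

In this invented data set the fourth item is answered correctly only by low scorers, so its discrimination is negative and it would be dropped, mirroring the removal rule described above.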

Development of the final test. While developing the final test, the items with high item discrimination were selected. In the pre-post test text named “The Boy with Rope Legs”, the first question (Question 8) for the 7th learning objective was eliminated because its discrimination was negative. Since the two other questions (Questions 7 and 8) were easy, the responses of the question with the highest item discrimination (Question 9) were corrected and included in the actual test. In the mid-evaluation named “Immortalized Garden”, the discrimination of two of the questions developed for the 32nd learning objective was low (Question 40 and Question 41). If an item has a discrimination of 0.19 or less, it is a weak item. If the item cannot be improved after corrections, it should be removed from the test (Turgut, 1992; Tekin, 2000; Akbulut & Çepni, 2013, p. 29). Of these two questions, Question 41, which had the higher discrimination value, was corrected and included in the actual test so as not to impair content validity. The other questions were used since there were no problems with them. Each of the two final tests contained 18 questions.

Reliability of the Test. The Kuder-Richardson Formula 20 (KR-20) reliability coefficient was calculated for both tests (Büyüköztürk et al. 2013, p. 111). According to Büyüköztürk (2012, p. 171), a KR-20 reliability coefficient of 0.70 and above is a necessary condition for test scores.

As a result of the statistical analyses, the KR-20 value was calculated as 0.82 for the pre-post test (The Boy with Rope Legs) and 0.81 for the mid-evaluation (Immortalized Garden). Both values are above 0.70, indicating that the scores from both tests were reliable.
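For reference, KR-20 is computed as k / (k - 1) x (1 - Σ p_i q_i / σ²), where k is the number of items, p_i is the proportion answering item i correctly, q_i = 1 - p_i, and σ² is the variance of the total scores. The sketch below assumes the population variance (dividing by n); some software uses the sample variance instead, which changes the result slightly. The small response matrix is invented for illustration.

```python
def kr20(responses):
    """Kuder-Richardson Formula 20 for dichotomous (0/1) item scores.

    responses: one list of 0/1 item scores per student.
    Uses the population variance of total scores (an assumption of this sketch).
    """
    n_students = len(responses)
    n_items = len(responses[0])
    totals = [sum(r) for r in responses]
    mean_total = sum(totals) / n_students
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_students
    sum_pq = 0.0
    for i in range(n_items):
        p = sum(r[i] for r in responses) / n_students  # proportion correct on item i
        sum_pq += p * (1 - p)
    return n_items / (n_items - 1) * (1 - sum_pq / var_total)


# Hypothetical answers of four students to three items.
sample = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
reliability = kr20(sample)  # 1.5 * (1 - 0.625 / 1.25) = 0.75
print(reliability)
```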

Qualitative data collection tools

Semi-structured interview form. A semi-structured interview form was developed to determine students’ views about the self-evaluation-based reading method. For the semi-structured interview form, the researcher plans the questions he or she wants to ask in advance. But the researcher does not completely depend on the questions. The form provides partial flexibility to the researcher. The researcher can reorganize the questions according to the course of the interview (Ekiz, 2013, p.63). For this purpose, a literature review was conducted to develop the form. In line with the findings from the literature review, a trial interview form was developed.

The content validity of the developed semi-structured interview form was ensured by taking the opinions of experts including four faculty members and one classroom teacher. Questions were directed to three students in the experimental group using the developed form. The form was finalized after the places with readability problems were corrected. This form was used in the interviews with the experimental group, and the interviews were recorded with a tape recorder.

Written student views. These were intended to diversify the data obtained on the students’ thoughts about the process. To this end, students were asked to state in writing their thoughts about the practices in the study.

Reliability and Validity of the Qualitative Data Collection Tools. In this section, the works done to increase the reliability and validity of the semi-structured interview and document analysis methods, which were qualitative data collection tools, are discussed in detail.

Downes and McMillan (2000) stated that the most important criterion in the evaluation of qualitative research is that the data obtained, the analysis of these data, and the results are credible and trustworthy. This depends on the validity and reliability of the qualitative research. For this reason, taking detailed field records, keeping audio and video records, providing quotations from the participants, and having more than one data source and observer affect the reliability of a qualitative study (Büyüköztürk et al., 2013, p. 245). While validity and reliability are concepts related to the measurement tool in quantitative data collection, in qualitative research they focus on the researcher, since the researcher rather than the data collection tool is dominant. In other words, in qualitative research methods such as the interview, the concept of reliability is explained by taking the qualifications of the researcher into consideration (Türnüklü, 2000, p. 550). In line with these explanations, to increase the reliability of the qualitative data in the present study, the opinion of a different researcher was taken in the pilot application and in the analysis and categorization of the data. To ensure the validity of the qualitative data, interviews were recorded, and data diversification and participant confirmation were carried out.

In order to ensure the validity of the data, first, data diversification was done. Collecting data of the same social phenomenon under different conditions is called data diversification (Yıldırım & Şimşek, 2016, p.352). In this study, the use of document analysis in addition to the interview technique was done for data diversification. Also, the interviews held were recorded.

Finally, participant confirmation was employed. The accuracy of the data obtained is checked with participant confirmation (Creswell, 2004, p.201). The data analyzed for this purpose were presented to the experimental group students who were interviewed, and the missing or misunderstood points were revised.

The semi-structured interview form, developed to increase reliability, was examined by the relevant field experts. In addition, the form was piloted with three students randomly selected from the experimental group. Then, the necessary corrections were made, and the form was finalized. The second step to ensure reliability is to examine consistency in analyzing the data and reaching the categories. For this purpose, the data were analyzed by two different researchers, and the agreement percentage of their analyses was checked. The agreement percentage should be above 80% (Creswell, 2004, p. 203). The formula for the agreement percentage is as follows.

$$P = \frac{N_a}{N_a + N_d} \times 100$$

$P$: agreement percentage; $N_a$: number of agreements; $N_d$: number of disagreements (Türnüklü, 2000, p. 551).
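As a minimal illustration of the agreement formula, the following Python sketch computes the percentage from the number of agreements and disagreements between two coders. The function name `agreement_percentage` and the example counts are hypothetical and are not taken from the study's data.

```python
def agreement_percentage(n_agreements, n_disagreements):
    """Return P = Na / (Na + Nd) * 100 for two coders."""
    total = n_agreements + n_disagreements
    if total == 0:
        raise ValueError("at least one coding decision is required")
    return 100 * n_agreements / total


# Hypothetical example: the coders agree on 8 codes and disagree on 2.
print(agreement_percentage(8, 2))  # 80.0
```

With these hypothetical counts the result sits exactly at the 80% level discussed in the text.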

In the present study, the content was analyzed by two different researchers. According to the analysis results, the agreement percentage was 100% for the first question, 66.6% for the second question, 83.3% for the third question, and 75% for the last question. For the overall form, the agreement percentage was 80%.

Data Collection

The data collection process is discussed under three headings, namely “Pre-implementation”, “During implementation”, and “Post-implementation”. Participants were informed before the process, and their consent was obtained.

Pre-implementation process

Before beginning the study, the necessary permissions were obtained from the students’ parents and the Nevşehir Provincial Directorate of National Education of Turkey. Since the study practices were time-consuming and individual, a parents’ meeting was held before the study. The parents were informed, and their support was requested for the audio recordings that would be made during the practices. The study was explained in detail in the parents’ meeting and in student interviews. The students were informed about what the reading speed should be and about what counts as reading mistakes and reading prosody. After listening to sample reading texts, the whole class commented on them, and ideas about good reading were shared.

It was decided to conduct the study on the five weekdays and on one weekend day each week. The study was planned as one class period every weekday and six periods at the weekend.

The study was planned for 10 weeks. In the first week, parents and students would be informed and the pretest administered. The reading practices would then be carried out for four weeks, the mid-evaluation would be administered in the 5th week, and the reading practices would continue until the last week. In the last week, the posttest would be administered and student views collected. Students’ written views would be collected once during the study.

The pre-implementation process was completed before the semester break. In this process, the pretest was administered to the groups, and the levels of the groups and individuals were determined after the necessary statistical analyses of the data. The pretest was conducted using the text “The Boy with Rope Legs”, which was included in the 4th grade Turkish textbook in the 2017-2018 academic year, and 18 multiple-choice questions developed from this text. The readings were recorded, and the reading speed, accuracy, and prosody were determined by listening to the recordings. The comprehension levels were determined with the reading comprehension test.

Implementation process

The nine weeks of the implementation were carried out in the second semester of the 2018-2019 academic year. The implementation with the study group was conducted by the researcher. In the first four weeks of the study, excluding the pre-implementation process, self-evaluation-based oral reading practices were conducted. Practices were mostly done one by one in the school library. The following steps were followed in the self-evaluation-based oral reading practices:

1. The student reads the text aloud.
2. The student’s reading is recorded.
3. The student listens to his or her own reading.
4. The student is asked to evaluate his or her own reading (speed, accuracy, prosody).
5. The student is asked to identify the reading mistakes and mark them in the reading text.
6. The necessary support is provided at the points where the student has difficulty.
7. The oral and silent readings are continued, paying attention to the mistakes made.
8. The student’s reading is recorded again, and the above process is repeated.
9. The student continues these practices until his or her deficits are remedied.
10. The student whose deficits are remedied moves on to a new reading text.

In the 6th week, the mid-evaluation was administered to the experimental group. The mid-evaluation was conducted using the text “Immortalized Garden”, which was included in the 4th grade Turkish textbook in the 2018-2019 academic year, and 18 multiple-choice questions developed from this text. The readings were recorded, and the reading speed, accuracy, and prosody were determined by listening to the recordings. The comprehension levels were determined with the reading comprehension test. Statistical analyses were performed on the data, and the level of the students compared to the previous test was determined. In addition, in this week the students were given blank A4 papers and asked to write their thoughts on the work done during the first six weeks. The study continued for four more weeks using different texts. The practices were not limited to the school but also continued at home with the support of parents. In this process, no additional activity was applied to the control group. The control group’s Turkish classes continued in accordance with the regular program.

Post-implementation process

In the last week of the implementation, the posttest was administered to the experimental and control groups. The readings were recorded, and the reading speed, accuracy, and prosody were determined by listening to the recordings. The reading comprehension levels of both groups were determined with the reading comprehension test. In addition, interviews were conducted with the experimental group, and their views on the work were determined. The interviews were recorded with a voice recorder.

Data Analysis

In the quantitative dimension of the study, reading comprehension tests, informal reading inventory, and prosodic reading scale were used. The data obtained from these data collection tools were analyzed as follows.

During the development of the reading comprehension test, the data were analyzed using the SPSS 18 program. Descriptive statistics were used to evaluate the implementation results of the developed tests. Nonparametric tests are recommended when the group size is less than 30 and the distribution of the population is not known exactly (Demirgil, 2016, p. 85). In this study, nonparametric tests were used because the study group consisted of 10 students. The significance level for the whole study was accepted as p ≤ 0.05. The Mann-Whitney U test was used to compare the pretest and posttest scores, correct readings, reading speeds, and reading prosodies of the experimental and control groups. This test is used to compare two independent samples with a small number of subjects according to one independent variable. It is the nonparametric equivalent of the independent t-test (Ekiz, 2013, p. 152). In addition, the Mann-Whitney U test was used to compare the reading fluency and comprehension skills of the students in the experimental group by gender.
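To make the logic of the Mann-Whitney U test concrete, the sketch below computes the U statistic for two independent groups from their pooled ranks. It is only an illustration of the statistic and not the SPSS procedure used in the study; the function names and the scores are hypothetical, and no p-value is computed.

```python
def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return (U, rank sum of group_a, rank sum of group_b)."""
    n_a, n_b = len(group_a), len(group_b)
    ranks = average_ranks(list(group_a) + list(group_b))
    rank_sum_a = sum(ranks[:n_a])
    rank_sum_b = sum(ranks[n_a:])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return min(u_a, u_b), rank_sum_a, rank_sum_b


# Hypothetical correct-reading percentages for two small groups.
experimental = [98, 95, 99, 97]
control = [90, 95, 92, 91]
print(mann_whitney_u(experimental, control))  # (0.5, 25.5, 10.5)
```

A small U relative to the product of the group sizes indicates that the scores of one group mostly rank above those of the other.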

The Wilcoxon signed-rank test is used to test whether the difference between measurements obtained from two related samples is significant. It is the nonparametric equivalent of the dependent t-test (Ekiz, 2013, p. 154). In the study, it was used to compare the pretest and mid-evaluation reading comprehension scores, reading speeds, correct reading, and reading prosody of the experimental group. Similarly, it was used for the experimental group’s mid-evaluation and posttest comparisons and its pretest and posttest comparisons. The Wilcoxon signed-rank test was also used to compare the control group’s pretest and posttest scores, reading speed, correct reading, and reading prosody.
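The following sketch illustrates how the Wilcoxon signed-rank statistics can be obtained for paired pretest and posttest scores. It uses the plain normal approximation without a tie correction, so it should be read as an assumption-laden illustration rather than the SPSS computation; statistical packages may adjust for ties and report slightly different z values. The function names and the scores are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def wilcoxon_signed_rank(pretest, posttest):
    """Return (negative rank sum, positive rank sum, z) for paired scores."""
    # Pairs with no change (ties) are dropped before ranking.
    diffs = [post - pre for pre, post in zip(pretest, posttest) if post != pre]
    n = len(diffs)
    if n == 0:
        raise ValueError("all pairs are tied")
    ranks = average_ranks([abs(d) for d in diffs])
    negative_sum = sum((r for r, d in zip(ranks, diffs) if d < 0), 0.0)
    positive_sum = sum((r for r, d in zip(ranks, diffs) if d > 0), 0.0)
    t = min(negative_sum, positive_sum)
    mean_t = n * (n + 1) / 4
    sd_t = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)  # no tie correction
    return negative_sum, positive_sum, (t - mean_t) / sd_t


# Hypothetical scores: all ten students improve, each by a different amount.
pretest = [50] * 10
posttest = [51, 52, 53, 54, 55, 56, 57, 58, 59, 60]
neg, pos, z = wilcoxon_signed_rank(pretest, posttest)
print(neg, pos, round(z, 3))  # 0.0 55.0 -2.803
```

When every one of ten students improves, the negative rank sum is 0 and the positive rank sum is 55, the most extreme pattern possible for this sample size.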

In the analysis of qualitative data, written and oral views of the students in the experimental group were used, and document analysis was carried out. Document analysis involves the analysis of materials that contain information about the phenomenon or facts to be examined. Document analysis can be used alone in a study or together with other methods to increase the validity of the study (Yıldırım & Şimşek, 2016, p. 189). In this study, document analysis was used to increase the validity of the interview method. Students’ written views were used in the document analysis.

The content analysis method was employed in the analysis of these data. In content analysis, the presence of certain words or concepts in the text or texts is examined. Based on these words or concepts, the researcher makes inferences about the message in the text (Büyüköztürk et al., 2013, p. 240). For this purpose, the researcher listened to and read the data he collected. Then, the data were coded. The coding process was conducted by both the researcher and a Turkish Language teacher. The agreement percentage between the two codings was determined as 80%.

The codes were then divided into main themes and sub-themes associated with them. Themes were interpreted as the last step.

Findings

Table 1. U test Results of the Pretest and Posttest Correct Reading Scores of the Experimental and Control Group Students

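| Test | Group | N | Mean Rank | Sum of Ranks | U | p |
|---|---|---|---|---|---|---|
| Pretest | Experimental Group | 10 | 12.35 | 123.50 | 31.50 | 0.159 |
| Pretest | Control Group | 10 | 8.65 | 86.50 | | |
| Posttest | Experimental Group | 10 | 14.30 | 143.00 | 12.00 | 0.003* |

*p<0.05 significance level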

According to Table 1, there was no statistically significant difference between the pretest correct reading scores of the experimental and control groups (U=31.50, p>0.05), whereas a significant difference was found when the posttest correct reading scores were compared (U=12.00, p<0.05).
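The U values in Table 1 can be checked against the reported rank sums, since for a group of size n with rank sum R the statistic is R - n(n + 1)/2, and the two groups' values add up to the product of the group sizes. The short sketch below applies this standard relation to the experimental group's rank sums from Table 1; the function name `u_from_rank_sum` is hypothetical.

```python
def u_from_rank_sum(rank_sum_a, n_a, n_b):
    """Return the smaller Mann-Whitney U given one group's rank sum."""
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return min(u_a, u_b)


# Experimental group's sums of ranks from Table 1 (10 students per group).
print(u_from_rank_sum(123.50, 10, 10))  # 31.5 (pretest)
print(u_from_rank_sum(143.00, 10, 10))  # 12.0 (posttest)
```

Both results agree with the U values reported in Table 1.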

Table 2. Wilcoxon Signed-Rank Test Results of Pretest and Posttest Correct Reading Scores of the Experimental and Control Group Students

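| Group | Posttest-Pretest | N | Mean Rank | Sum of Ranks | Z | p |
|---|---|---|---|---|---|---|
| Experimental Group | Negative Rank | 0 | 0.00 | 0.00 | -2.673 | 0.008* |
| | Positive Rank | 9 | 5.00 | 45.00 | | |
| | Ties | 1 | | | | |
| Control Group | Negative Rank | 1 | 2.00 | 2.00 | -2.625 | 0.009* |
| | Positive Rank | 9 | 5.89 | 53.00 | | |
| | Ties | 0 | | | | |

*p<0.05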

According to Table 2, there was a significant difference between the pretest and posttest correct reading scores of both the experimental and the control group students (z=-2.673, p<0.05; z=-2.625, p<0.05). However, the experimental group students increased their scores more than the control group students.

Table 3. U test Results of the Pretest and Posttest Reading Speed Scores of the Experimental and Control Group Students

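| Test | Group | N | Mean Rank | Sum of Ranks | U | p |
|---|---|---|---|---|---|---|
| Pretest | Experimental Group | 10 | 13.00 | 130.00 | 25.00 | 0.58 |
| Posttest | Experimental Group | 10 | 13.45 | 134.50 | 20.50 | 0.026* |

*p<0.05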

According to Table 3, there was no statistically significant difference between the pretest reading speed scores of the experimental and control groups (U=25.00, p>0.05), whereas a significant difference was found between the experimental group and the control group when their posttest reading speed scores were compared (U=20.50, p<0.05). Based on these findings, it can be seen that the experimental and control groups were equal before the implementation, and that their reading speed scores differed significantly in favor of the experimental group after the implementation.

Table 4. Wilcoxon Signed-Rank Test Results of Pretest and Posttest Reading Speed Scores of the Experimental and Control Group Students

| Group | Posttest-Pretest | N | Mean Rank | Sum of Ranks | Z | p |
|---|---|---|---|---|---|---|
| Experimental Group | Negative Rank | 3 | 2.50 | 7.50 | -2.040 | 0.041* |
| | Positive Rank | 7 | 6.79 | 47.50 | | |
| | Ties | 0 | | | | |
| Control Group | Ties | 0 | | | -2.608 | 0.009* |

*p<0.05

According to Table 4, there was a significant difference between the pretest and posttest reading speed scores of both the experimental and the control group students (z=-2.040, p<0.05; z=-2.608, p<0.05).

Table 5. U test Results of the Pretest and Posttest Prosodic Reading Scores of the Experimental and Control Group Students

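| Test | Group | N | Mean Rank | Sum of Ranks | U | p |
|---|---|---|---|---|---|---|
| Pretest | Experimental Group | 10 | 12.30 | 123.00 | 32.00 | 0.173 |
| Posttest | Experimental Group | 10 | 13.45 | 134.50 | 20.50 | 0.025* |

*p<0.05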

According to Table 5, there was no statistically significant difference between the pretest prosodic reading scores of the experimental and control groups (U=32.00, p>0.05), whereas a significant difference was found when the posttest prosodic reading scores were compared (U=20.50, p<0.05). Based on these findings, it can be said that the experimental and control groups were equal in terms of reading prosody before the implementation, and that their prosodic reading scores differed significantly in favor of the experimental group after the implementation.

Table 6. Wilcoxon Signed-Rank Test Results of Pretest and Posttest Prosodic Reading Scores of the Experimental and Control Group Students

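| Group | Posttest-Pretest | N | Mean Rank | Sum of Ranks | Z | p |
|---|---|---|---|---|---|---|
| Experimental Group | Negative Rank | 0 | 0.00 | 0.00 | -2.805 | 0.005* |
| | Positive Rank | 10 | 5.50 | 55.00 | | |
| | Ties | 0 | | | | |
| Control Group | Negative Rank | 2 | 1.50 | 3.00 | -2.314 | 0.021* |
| | Positive Rank | 7 | 6.00 | 42.00 | | |
| | Ties | 1 | | | | |

*p<0.05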

According to Table 6, there was a significant difference between the pretest and posttest prosodic reading scores of both the experimental and the control groups (z=-2.805, p<0.05; z=-2.314, p<0.05). However, the experimental group students increased their scores more. Based on this finding, it can be said that the self-evaluation-based oral reading practices were more effective on prosodic reading scores.

Table 7. U test Results of the Pretest and Posttest Reading Comprehension Scores of the Experimental and Control Group

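| Test | Group | N | Mean Rank | Sum of Ranks | U | p |
|---|---|---|---|---|---|---|
| Pretest | Experimental Group | 10 | 11.55 | 115.50 | 39.50 | 0.426 |
| Posttest | Experimental Group | 10 | 14.25 | 142.50 | 12.50 | 0.004* |

*p<0.05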

According to Table 7, there was no statistically significant difference between the pretest reading comprehension scores of the experimental and control groups (U=39.50, p>0.05), whereas a significant difference was found when the posttest reading comprehension scores were compared (U=12.50, p<0.05). Based on these findings, it can be said that the experimental and control groups were equal in terms of reading comprehension before the implementation, and that their reading comprehension scores differed significantly in favor of the experimental group after the implementation.

Table 8. Wilcoxon Signed-Rank Test Results of Pretest and Posttest Reading Comprehension Scores of the Experimental and Control Group Students

| Group | Ties | Z | p |
|---|---|---|---|
| Experimental Group | 0 | -2.601 | 0.009* |
| Control Group | 1 | -0.237 | 0.812 |

*p<0.05

According to Table 8, there was a significant difference between the pretest and posttest reading comprehension scores of the experimental group (z=-2.601, p<0.05), whereas there was no significant difference between the pretest and posttest scores of the control group (z=-0.237, p>0.05). Based on these results, it can be said that the self-evaluation-based oral reading practices were effective on reading comprehension.

Table 9. Wilcoxon Signed-Rank Test Results of Pretest, Mid-Evaluation and Posttest Correct Reading Scores of the Experimental Group Students

| Comparison | Ties | Z | p |
|---|---|---|---|
| Mid-Evaluation-Pretest | 2 | -1.294 | 0.196 |
| Posttest-Pretest | 1 | -2.673 | 0.008* |
| Posttest-Mid-Evaluation | 1 | -2.687 | 0.007* |

*p<0.05

According to Table 9, there was no significant difference between the mid-evaluation and pretest correct reading scores of the experimental group students (z=-1.294, p>0.05), whereas there was a significant difference between the posttest and pretest correct reading scores (z=-2.673, p<0.05) and between the posttest and mid-evaluation correct reading scores (z=-2.687, p<0.05). Based on these findings, it can be said that the self-evaluation-based oral reading method applied to the experimental group students did not affect correct reading in the short term but did affect it with long-term practice.

Table 10. Wilcoxon Signed-Rank Test Results of Pretest, Mid-Evaluation and Posttest Reading Speed Scores of the Experimental Group Students

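| Comparison | Ties | Z | p |
|---|---|---|---|
| Mid-Evaluation-Pretest | 0 | -2.805 | 0.005* |
| Posttest-Pretest | 0 | -2.040 | 0.041* |
| Posttest-Mid-Evaluation | 0 | -2.803 | 0.005* |

*p<0.05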

According to Table 10, there was a significant difference between the mid-evaluation and pretest reading speed scores of the experimental group students (z=-2.805, p<0.05), between the posttest and pretest reading speed scores (z=-2.040, p<0.05), and between the posttest and mid-evaluation reading speed scores (z=-2.803, p<0.05). Based on these findings, it can be said that the self-evaluation-based oral reading method applied to the experimental group students had an effect on their reading speed throughout the implementation.

Table 11. Wilcoxon Signed-Rank Test Results of Pretest, Mid-Evaluation and Posttest Prosodic Reading Scores of the Experimental Group Students

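| Comparison | Ties | Z | p |
|---|---|---|---|
| Mid-Evaluation-Pretest | 0 | -2.199 | 0.028* |
| Posttest-Pretest | 0 | -2.805 | 0.005* |
| Posttest-Mid-Evaluation | 0 | -2.805 | 0.005* |

*p<0.05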

According to Table 11, there was a significant difference between the mid-evaluation and pretest prosodic reading scores of the experimental group students (z=-2.199, p<0.05), between the posttest and pretest prosodic reading scores (z=-2.805, p<0.05), and between the posttest and mid-evaluation prosodic reading scores (z=-2.805, p<0.05). Based on these findings, it can be said that the self-evaluation-based oral reading method applied to the experimental group students had an effect on their prosodic reading throughout the implementation.

Table 12. Wilcoxon Signed-Rank Test Results of Pretest, Mid-Evaluation and Posttest Reading Comprehension Scores of the Experimental Group Students

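| Comparison | Rank | N | Mean Rank | Sum of Ranks | Z | p |
|---|---|---|---|---|---|---|
| Mid-Evaluation-Pretest | Negative Rank | 3 | 5.50 | 16.50 | -1.112 | 0.262 |
| | Ties | 0 | | | | |
| Posttest-Pretest | Negative Rank | 1 | 2.00 | 2.00 | -2.601 | 0.009* |
| | Ties | 0 | | | | |
| Posttest-Mid-Evaluation | Negative Rank | 3 | 4.00 | 12.00 | -1.580 | 0.114 |
| | Ties | 0 | | | | |

*p<0.05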

According to Table 12, there was no significant difference between the mid-evaluation and pretest reading comprehension scores of the experimental group students (z=-1.112, p>0.05) or between the posttest and mid-evaluation reading comprehension scores (z=-1.580, p>0.05). However, there was a significant difference between the posttest and pretest reading comprehension scores (z=-2.601, p<0.05). Based on these findings, it can be said that the self-evaluation-based oral reading method applied to the experimental group students did not affect reading comprehension in the short term but did affect it with long-term practice.

Figure 2. Themes and Sub-Themes Regarding Student Views on the Self-Evaluation-Based Oral Reading Method

According to Figure 2, four main themes and sixteen sub-themes related to these themes were revealed based on students’ views on the self-evaluation-based oral reading method: 1- Reading fluency skill (accuracy, speed, prosody, comprehension), 2- Motivation (being interesting, enjoyment), 3- Self-evaluation (feedback, dispiritedness, being ashamed, being judged), 4- Reflection of the instruction to other areas (Math, Turkish, story reading, exams, hosting, poetry reading).

Findings Regarding the Improvement in Reading Fluency Skills

Students’ views on the first question focused mainly on accuracy, speed, prosody, and comprehension. In their written views, the experimental group students stated that the practices helped them read accurately, and they expressed the same views in their interviews. The view of the student with code A5 is presented below.

In the beginning, I had a lot of reading mistakes. The practices done were effective in correcting my reading mistakes. (A5, interview)

The experimental group students also expressed that the practices increased their reading speed. In their interviews, the students said that the practices helped them read faster, and they stated the same in their written views. The view of the student with code A8 is given below.

It helped me like this. I used to read slowly, now I read a little faster. I read the subtitles on TV faster. (A8, written student view)

The experimental group students stated that they also saw the effect of the practices on their prosodic reading. The students expressed that, before the study, they did not pay enough attention to punctuation marks and did not reflect the emotion of the piece in their reading. At the end of the practices, the students wrote in their written views that they had made progress in this area, and they said the same in their interviews. The view of the student with code A10 on the effect of the practices on reading prosody is presented below.

I read emotionally, paying attention to punctuation. I’m not breaking down words that don't fit at the end of the line. I read better now. (A10, written student view)

In addition, the experimental group students expressed that the practices had a favorable effect on their reading comprehension. The students shared this both in their written views and in their interviews. The students stated that they began to think about their reading and that this made it easier for them to comprehend. The view of the student with code A7 on the effect of the practices on her reading comprehension is given below.

My reading has improved. I can now read by guessing and I understand better. (A7, interview)

Findings Regarding the Increase in Reading Motivation

Regarding the second question, the students stated that there was nothing they had difficulty with. Beyond not having difficulty, the students expressed that the practices were interesting and that they liked them very much. The students shared this both in their written views and in their interviews. Some students shared different views. For example, the student with code A3 said, “At first, I had a hard time finding my mistakes.” The view of the student with code A2 on this theme is presented below.

There was nothing difficult. It was very fun work. (A2, interview)