
Question types in English language diagnostic testing

College: College of Education for Human Sciences     Department: Department of English     Stage: 4
Course instructor: منير علي خضير ربيع       1/27/2012 3:56:34 PM

English language proficiency testing, like large-scale testing in many other
domains, often uses multiple-choice questions (MCQs) to exploit the efficiency of
automatic marking. An experiment supplementing established MCQ tests with
very short free-text answer questions in English diagnostic testing has shown
that the latter are better discriminators at the lower end of student ability.
Although such questions are not economical to mark on paper on a large scale, the
Assess By Computer e-assessment software offers marking options for such
answers which make constructed-answer tests a realistic option.

Constructed vs selected answers in large-scale testing

In many domains, there is a need for quick and efficient large-scale testing of
straightforward material. Selected-answer [1] tests can be automatically marked,
and are thus widely used (for example, in the UK, for the DVLA automobile
Driving Theory test).
Diagnostic testing of English language proficiency is another such domain,
with global scope. Locally at the University of Manchester (UoM), the
University Language Centre (ULC) tests over 1,000 students per academic
year - around 850 in a single week each September - to assess their linguistic
ability to follow an academic course. (This is additional to the standard TOEFL
/ IELTS admission requirements.)
The UoM ULC tests consist mainly of MCQs, as described below. However,
selected-answer questions tightly constrain the extent to which a candidate
can give evidence of incompetence, even in this intellectually limited domain.
Our hypothesis was that, given the chance to answer freely, the weakest
candidates would give evidence of greater weakness than could be seen from
MCQ results. This was confirmed by the data.

[1] We reserve the term “objective” to mean questions to which the answer, rather than the
marking judgement, is a matter of objective fact - as opposed to “subjective”. Objective
questions can require constructed answers, and subjective questions selected answers
(“Give your opinion on a scale from 1 to 5 …”).

English language diagnostic testing

The test employed by UoM ULC for English language diagnostic testing is the
Chaplen Speeded Grammar and Vocabulary Test (Chaplen, 1970), which has
been used at Manchester University since the early 1970s. It is a well-tried
and tested gauge of a learner’s knowledge of the English language system
and its formal or “educated” vocabulary. The total number of correct answers
is presented as a percentage score. The test discriminates well at the upper
intermediate and advanced levels of language proficiency, with students at
these levels typically attaining scores ranging from 50% to 90%. A score of
more than 90% indicates that the learner is approaching native speaker level.
Below intermediate levels (approximately 40%), however, it is not a useful
instrument, as it begins to lose its discriminatory power. High marks on the Chaplen
typically correlate well with the number of years studying English as a foreign
language in formal settings, though the strength of this relationship has not
been tested.
Originally developed in the late 1960s, the test reflects the structuralist
description of language and methods of language testing. Using a
multiple-choice format, the Grammar (10 minutes) and Vocabulary (18 minutes)
sections test students’ knowledge of a range of individual items of structure and lexis in
“everyday educated English” (Chaplen, 1970: 174). For each section, there
are 50 questions, each consisting of a sentence with a word or phrase
omitted. The test taker must choose the correct filler from the list of possible
answers provided. There is a choice of three possible answers in the
Grammar section and five possible answers in the Vocabulary section. The
short amount of time allowed for each section means that students work
under considerable pressure of time and only the more proficient students
manage to complete all the questions. The test is quick to administer and
quick to mark. This is one of its major advantages, since it permits the rapid
processing of very large numbers of students at low cost. Combining this
rapidity of administration with an OMR marking system means that up to 1,000
students can be tested and given their mark within a few days.
The theoretical assumption which underpins the use of the test at Manchester
is that adequate knowledge of the general language system can serve as a
reliable indication of a student’s ability to apply this knowledge in academic
situations. Its principal use is to identify recently arrived overseas students
who would benefit from attending classes in academic writing provided by the
University, or who will probably experience difficulties in their academic work
due to less than adequate levels of English language proficiency in reading
and writing. A score of 40% or less broadly indicates that a student has an
inadequate level of English language proficiency for academic study. The
extensive trials that the test underwent during its development would appear
to support this (Chaplen, 1970). In addition, in two follow-up studies, the test
has been shown to have reasonable predictive validity (James, 1980; O’Brien,
1993).
Because Chaplen is basically a test of discrete item recognition, it has to be
complemented by a piece of continuous writing. The writing test consists of
three questions to which short “essay” answers are expected, to be completed
in 30 minutes. Morley (2000), who made a number of improvements to the
test, has shown that the test scores correlate quite strongly with assessment
of students’ continuous writing by trained assessors [2].
Despite all its advantages, it needs to be emphasised that Chaplen is not a
test of language production, and it is not a test of language skills. In fact, it
assesses a fairly narrow aspect of language competence through the
recognition of correct lexical and grammatical choices provided as part of an
artificially restricted set of choices. In this sense, it is less of a finely tuned
instrument than the much more sophisticated, and much more expensive,
internationally recognised university entry tests (e.g. IELTS and TOEFL), which
take very much longer but which also test a broad range of language skills.
Furthermore, despite the strong correlations with writing scores mentioned
above, it is still not uncommon to come across cases of good spoken and
written communicators who do not score well on Chaplen, and of good
Chaplen scorers who are not good communicators. Finally, because of the
multiple-choice design, the discriminatory power of the test below a certain
level is weak (around 40%) and even non-existent (around 25%). We
therefore sought ways of maintaining the efficiency of the instrument, whilst at
the same time endeavouring to measure students’ ability to produce correct
language rather than simply to choose it. The aim was to fine-tune the
instrument and to increase its discriminatory power without any loss of
efficiency.
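The figure of around 25% can be related to the section layout described earlier. The following is a minimal sketch, not taken from the source, of the score a candidate would be expected to reach by guessing alone. It assumes that every item is attempted, that guesses are uniformly random, that wrong answers carry no penalty, and that the percentage is taken over all 100 items together. The names `SECTIONS` and `expected_guess_score` are illustrative only.

```python
from fractions import Fraction

# Section layout as described in the text: 50 items per section,
# three options per Grammar item and five per Vocabulary item.
SECTIONS = {
    "Grammar": {"items": 50, "options": 3},
    "Vocabulary": {"items": 50, "options": 5},
}


def expected_guess_score(sections):
    """Expected percentage score for a candidate who guesses every item at random."""
    total_items = sum(s["items"] for s in sections.values())
    # Each item contributes 1/options expected marks under uniform guessing.
    expected_correct = sum(
        Fraction(s["items"], s["options"]) for s in sections.values()
    )
    return float(expected_correct / total_items * 100)


chance_score = expected_guess_score(SECTIONS)
print(f"Expected score from guessing alone: {chance_score:.1f}%")
```

Under these assumptions the expected score is a little under 27%, which is consistent with the observation that the test cannot separate candidates scoring around 25% from candidates who are simply guessing. Because the test is speeded and weaker candidates may not finish, real guessing scores could be lower.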

The experimental tests

Our hypothesis was that, if a practical way could be found of testing with
free-text questions - even with single-word answers - (a) all the students would be
more effectively challenged by what would become, in effect, a production
rather than a recognition task; (b) the weakest students would make more
extreme errors than any of the MCQ distractors, and we would thus have more
effective discrimination at the bottom end of the range. The ABC (Assess By
Computer) e-assessment software (Sargeant et al., 2004) developed at UoM
looked promising, and has been used in (to date) two trial runs of free-text
question tests, with a third scheduled.
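To illustrate why very short free-text answers can still be marked largely automatically, here is a minimal sketch of exact-match marking against a list of accepted answers. It is not a description of how the ABC software marks; the function names, item identifiers and sample answers are all hypothetical. Answers that match no accepted form are collected for human review rather than discarded, since those are the responses in which the weakest candidates' errors would show up.

```python
def normalise(answer: str) -> str:
    """Lower-case and collapse whitespace so trivial typing differences are ignored."""
    return " ".join(answer.lower().split())


def mark_free_text(responses, accepted):
    """Mark one candidate's short answers.

    Returns (percentage score, list of (item_id, answer) pairs that matched nothing).
    """
    correct = 0
    unmatched = []
    for item_id, accepted_answers in accepted.items():
        given = normalise(responses.get(item_id, ""))
        if given in {normalise(a) for a in accepted_answers}:
            correct += 1
        else:
            unmatched.append((item_id, given))
    return 100 * correct / len(accepted), unmatched


# Hypothetical items and one hypothetical candidate's responses.
accepted = {"g1": ["has been"], "g2": ["whom", "who"], "v1": ["hypothesis"]}
responses = {"g1": " Has  been ", "g2": "which", "v1": "hipotesis"}

score, to_review = mark_free_text(responses, accepted)
print(f"{score:.1f}% correct; answers needing review: {to_review}")
```

A design along these lines keeps the marking time close to that of an MCQ test when most answers match, while leaving the unmatched answers available for a marker to inspect.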
The original UoM English language proficiency diagnostic assessment, as
described above, consists of three separately timed tests [3]: “Grammar and
Usage” (10 minutes), “Vocabulary” (18 minutes), and “Writing” (30 minutes).
These tests were set up in the ABC software and first taken in this form in
February 2007 by 23 students (January entrants to postgraduate programs in
the School of Computer Science, UoM).
Although the students were unfamiliar with the software, none showed any
signs of difficulty in using it, and results were as expected from previous
experience with similar groups, i.e. there was no evidence of bias caused by
use of the software. The clear difference lay in the speed of marking. The
MCQ tests were marked automatically, with marking complete within minutes
of the last student submitting their answers, rather than waiting several days
for a scanning service. The answers to the “Writing” test were output as a PDF
file and marked on paper: the saving here lay in the greater ease of reading
typescript than handwriting.