LSP Testing
The development of specific purpose testing, i.e., tests in which the test content and test method are derived from a particular language use context rather than more general language use situations, can be traced back to the Temporary Registration Assessment Board (TRAB), introduced by the British General Medical Council in 1976 (see Rea-Dickins, 1987), and the development of the English Language Testing Development Unit (ELTDU) scales (Douglas, 2000). The 1980s saw the introduction of English for Academic Purposes (EAP) tests, and it is these that have subsequently dominated the research and development agenda. It is important to note, however, that Language for Specific Purposes (LSP) tests are not the diametric opposite of general purpose tests. Rather, they typically fall along a continuum between general purpose tests and those for highly specialised contexts, and include tests for academic purposes (e.g., the International English Language Testing System, IELTS) and for occupational or professional purposes (e.g., the Occupational English Test, OET). Douglas (1997, 2000) identifies two aspects that typically distinguish LSP testing from general purpose testing. The first is the authenticity of the tasks, i.e., the test tasks share key features with the tasks that a test taker might encounter in the target language use situation. The assumption here is that the more closely the test and real-life tasks are linked, the more likely it is that the test taker's performance on the test task will reflect their performance in the target situation. The second distinguishing feature of LSP testing is the interaction between language knowledge and specific content knowledge. This is perhaps the most crucial difference between general purpose testing and LSP testing, for in the former, any sort of background knowledge is considered to be a confounding variable that contributes construct-irrelevant variance to the test score.
However, in the case of LSP testing, background knowledge constitutes an integral part of what is being tested, since it is hypothesised that test takers' language knowledge has developed within
the context of their academic or professional field and that they would be disadvantaged by taking a test based on content outside that field. The development of an LSP test typically begins with an in-depth analysis of the target language use situation, perhaps using genre analysis (see Tarone, 2001). Attention is paid to general situational features such as topics, typical lexis and grammatical structures. Specifications are then developed that take into account the specific language characteristics of the context as well as typical scenarios that occur (e.g., Plakans & Abraham, 1990; Stansfield et al., 1990; Scott et al., 1996; Stansfield et al., 1997; Stansfield et al., 2000). Particular areas of concern, quite understandably, tend to relate to issues of background knowledge and topic choice (e.g., Jensen & Hansen, 1995; Clapham, 1996; Fox et al., 1997; Celestine & Cheah, 1999; Jennings et al., 1999; Papajohn, 1999; Douglas, 2001a) and authenticity of task, input or, indeed, output (e.g., Lumley & Brown, 1998; Moore & Morton, 1999; Lewkowicz, 2000; Elder, 2001; Douglas, 2001a; Wu & Stansfield, 2001), and these areas of concern have been a major focus of research attention in the last decade. Results, though somewhat mixed (cf. Jensen & Hansen, 1995, and Fox et al., 1997), suggest that background knowledge and language knowledge interact differently depending on the language proficiency of the test taker. Clapham's (1996) research into subject-specific reading tests (research she conducted during and after the ELTS revision project) shows that, at least in the case of her data, the scores of neither lower nor higher proficiency test takers seemed influenced by their background knowledge. She hypothesises that for the former this was because they were most concerned with decoding the text, and for the latter it was because their linguistic knowledge alone was sufficient for them to decode the text.
However, the scores of medium proficiency test takers were affected by their background knowledge. On the basis of these findings she argues that subject-specific tests are not equally valid for test takers at different levels of language proficiency. Fox et al. (1997), examining the role of background knowledge in the context of the listening section of an integrated test of English for Academic Purposes (the Carleton Academic English Test, CAEL), report a slight variation on this finding. They too find a significant interaction between language proficiency and background knowledge, with the scores of low proficiency test takers showing no benefit from background knowledge. However, the scores of the high proficiency candidates and analysis of their verbal protocols indicate that they did make use of their background knowledge to process the listening task. Clapham (1996) has further shown that background knowledge is an extremely complex concept. She reveals dilemmas including the difficulty of identifying with any precision the absolute specificity of an input passage and the nigh impossibility of being certain about test takers' background knowledge (particularly given that test takers often read outside their chosen academic field and might even have studied in a different academic area in the past). This is of particular concern when tests are topic-based and all the sub-tests and tasks relate to a single topic area. Jennings et al. (1999) and Papajohn (1999) look at the possible effect of topic, in the case of the former, for the CAEL and, in the case of the latter, in the chemistry TEACH test for international teaching assistants. They argue that the presence of a topic effect would compromise the construct validity of the test, whether test takers are offered a choice of topic during test administration (as with the CAEL) or not. Papajohn finds that topic does play a role in chemistry TEACH test scores and warns of the danger of assuming that subject-specificity automatically guarantees topic equivalence. Jennings et al. are relieved to report that choice of topic does not seem to affect test taker performance on the CAEL. However, they do note that there is a pattern in the choices made by test takers of different proficiency levels and suggest that more research is needed into the implications of these patterns for test performance. Another particular concern of LSP test developers has been authenticity (of task, input and/or output), one example of the care taken to ensure that test materials are authentic being Wu and Stansfield's (2001) description of the test construction procedure for the LSTE-Taiwanese (Listening Summary Translation Exam). Yet Lewkowicz (1997) somewhat puts the cat among the pigeons when she demonstrates that it is not always possible to distinguish authentic texts from those specially constructed for testing purposes.
She further problematises the valuing of authenticity in her study of a group of test takers' perceptions of an EAP test, finding that they seemed unconcerned about whether the test materials were situationally authentic or not. Indeed, they may even consider multiple-choice tests to be authentic tests of language, as opposed to tests of authentic language (Lewkowicz, 2000). (For further discussion of this topic, see Part Two of this review.) Other test development concerns, however, are very much like those of researchers developing tests in different sub-skills. Indeed, researchers working on LSP tests have contributed a great deal to our understanding of a number of issues related to the testing of reading, writing, speaking and listening. Apart from being concerned with how best to elicit samples of language for assessment (Read, 1990), they have investigated the influence of interlocutor behaviour on test takers' performance in speaking tests (e.g., Brown & Lumley, 1997; McNamara & Lumley, 1997; Reed & Halleck, 1997). They have also studied the assumptions underpinning rating
scales (Hamilton et al., 1993) as well as the effect of rater variables on test scores (Brown, 1995; Lumley & McNamara, 1995) and the question of who should rate test performances: language specialists or subject specialists (Lumley, 1998). There have also been concerns related to the interpretation of test scores. Just as in general purpose testing, LSP test developers are concerned with minimising and accounting for construct-irrelevant variables. However, this can be a particularly thorny issue in LSP testing, since construct-irrelevant variables can be introduced as a result of the situational authenticity of the test tasks. For instance, in his study of the chemistry TEACH test, Papajohn (1999) describes the difficulty of identifying when a teaching assistant's teaching skills (rather than language skills) are contributing to his/her test performance. He argues that test behaviours such as the provision of accessible examples or good use of the blackboard are not easily distinguished as teaching or language skills, and this can result in construct-irrelevant variance being introduced into the test score. He suggests that test takers should be given specific instructions on how to present their topics, i.e., teaching tips, so that teaching skills do not vary widely across performances. Stansfield et al. (2000) have taken a similar approach in their development of the LSTE-Taiwanese. The assessment begins with an instruction section on the summary skills needed for the test, with the aim of ensuring that test performances are not unduly influenced by a lack of understanding of the task requirements. It must be noted, however, that, because of the need for in-depth analysis of the target language use situation, LSP tests are time-consuming and expensive to produce. It is also debatable whether English for Specific Purposes (ESP) tests are more informative than a general purpose test. Furthermore, it is increasingly unclear just how specific an LSP test is or can be.
Indeed, more than a decade has passed since Alderson (1988) first asked the crucial question of how specific ESP testing could get. This question is recast in Elder's (2001) work on LSP tests for teachers when she asks whether, for all their teacherliness, these tests elicit language that is essentially different from that elicited by a general language test. An additional concern is the finding that construct-relevant variables such as background knowledge and compensatory strategies interact differently with language knowledge depending on the language proficiency of the test taker (e.g., Halleck & Moder, 1995; Clapham, 1996). As a consequence of Clapham's (1996) research, the current IELTS test has no subject-specific reading texts, and care is taken to ensure that the input materials are not biased for or against test takers of different disciplines. Though the extent to which this lack of bias has been achieved is debatable (see Celestine & Cheah, 1999), it can still be argued that the attempt to make texts
accessible regardless of background knowledge has resulted in the IELTS test being very weakly specific. Its claims to specificity (and indeed similar claims by many EAP tests) rest entirely on the fact that it is testing the generic language skills needed in academic contexts. This leaves it unprotected against suggestions like Clapham's (2000a), when she questions the theoretical soundness of assessing discourse knowledge that the test taker, by registering for a degree taught in English, might arguably be hoping to learn, and that even a native speaker of English might lack. Recently the British General Medical Council has abandoned its specific purpose test, the Professional and Linguistic Assessment Board (PLAB, a revised version of the TRAB), replacing it with a two-stage assessment process that includes the use of the IELTS test to assess linguistic proficiency. These developments represent the thin end of the wedge. Though the IELTS is still a specific purpose test, it is itself less so than its precursor, the English Language Testing System (ELTS), and it is certainly less so than the PLAB. And so the questioning continues. Davies (2001) has joined the debate, debunking the theoretical justifications typically put forward to explain LSP testing, in particular the principle that different fields demand different language abilities. He argues that this principle is based far more on differences of content than on differences of language (see also Fulcher, 1999a). He also questions the view that content areas are discrete and heterogeneous. Despite all the rumblings of discontent, Douglas (2000) stands firmly by claims made much earlier in the decade that in highly field-specific language contexts, a field-specific language test is a better predictor of performance than a general purpose test (Douglas & Selinker, 1992).
He concedes that many of these contexts will be small-scale educational, professional or vocational programmes in which the number of test takers is small, but maintains (Douglas, 2000: 282): 'if we want to know how well individuals can use a language in specific contexts of use, we will require a measure that takes into account both their language knowledge and their background knowledge, and their use of strategic competence in relating the salient characteristics of the target language use situation to their specific purpose language abilities. It is only by so doing ... that we can make valid interpretations of test performances.' He also suggests that the problem might lie not with the LSP tests or with their specification of the target language use domain but with the assessment criteria applied. He argues (Douglas, 2001b) that, just as we analyse the target language use situation in order to develop the test content and methods, we should exploit that source when we develop the assessment criteria. This might help us to avoid expecting a perfection of the test taker that is not manifested in authentic performances in the target language use situation. But perhaps the real challenge to the field is in identifying when it is absolutely necessary to know how well someone can communicate in a specific context, or whether the information being sought is equally obtainable through a general purpose language test. The answer to this challenge might not be as easily reached as is sometimes presumed.