
Wednesday, 27 December 2017

The Need for Assessment Literacy among Teachers

The need for teachers' assessment literacy is growing as teachers are given ever greater responsibilities with regard to assessment. Ongoing classroom assessment requires the teacher to be aware of the technicalities of assessment; beyond that, the teacher needs to know how to design a good assessment for her classroom. This may be too much to expect, given that teachers receive little training in assessment during teacher education, and that time is not abundantly available to a teacher. Unless in-service training and personal interest work hand in hand with institutional and peer support, the teacher may not be able to do this effectively. Moreover, leaving the teacher out of the circle of assessment will further widen the gap between theory and practice in the larger discipline of pedagogy, one of the greatest pedagogical tragedies of all time!

Teacher in Class
The gravity of this becomes clearer if we understand that no decision made at the policy or administrative level can reach the learner without the teacher's sound understanding. So if an assessment is planned and designed by someone other than the teacher, its implementation may fail altogether, because in the classroom it is the teacher who has to face the ground reality. If the teacher doesn't understand why the assessment is designed the way it is, its implementation may miss the point! Moreover, the teacher is the most important element in pedagogy after the learner; it is the teacher who bridges the curriculum objectives and the learner. Therefore, the teacher needs a sound understanding of how assessment is designed, implemented and scored.

What works better: one-time teacher training or ongoing teacher engagement? Certainly the latter, because ongoing teacher development does not take the teacher away from her teaching context; she can practise and experiment with what she learns in teacher development programmes. The former, on the other hand, is a stand-alone programme that does not connect with teaching contexts or students, and what one learns in such a programme may well remain unused. For the same reason, it is important that teachers remain active within dynamic professional communities that keep up with the latest developments in pedagogy. Healthy relationships among teachers can lead to the sharing of resources and ideas, and can result in action research with much promise.

Since teaching and assessment go hand in hand, and assessment feeds back into instruction, a teacher has to have at least a basic know-how of assessment. Lacking this understanding leads to ineffective classroom tests, inappropriate interpretation of test scores, and pointless instruction that is not reflected in assessment. If the teacher is to give quality feedback, she must understand assessment. Hence the importance of assessment literacy.

Teachers should have hands-on experience of identifying learners' learning goals, stating them in clear terms for test development, stating learning outcomes, proposing what authentic assessment looks like, writing tasks that fit the cultural and social contexts of learners, writing rubrics for assessment, and defining criteria for rating. This can be achieved through practice, while in touch with professional companions and professional development programmes. How we do it is not as important as the fact that we do it no matter what, because a teacher's awareness of assessment is as important as her awareness of teaching techniques!
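As a small illustration of the last two skills, writing rubrics and defining criteria for rating, a rubric can be thought of as a set of criteria, each with a weight and a maximum rating. The sketch below (in Python; the criteria, weights and ratings are invented, purely hypothetical) shows one way such a rubric could turn a rater's criterion-by-criterion ratings into a single score.

```python
# Hypothetical sketch: a writing rubric as a simple data structure.
# Criterion names, weights and maximum ratings are invented for
# illustration; a real rubric would come from stated learning outcomes.
rubric = {
    "content":      {"weight": 0.4, "max": 5},
    "organisation": {"weight": 0.3, "max": 5},
    "language use": {"weight": 0.3, "max": 5},
}

def score_task(ratings, rubric):
    """Combine per-criterion ratings into a weighted percentage score."""
    total = 0.0
    for criterion, spec in rubric.items():
        total += spec["weight"] * ratings[criterion] / spec["max"]
    return round(100 * total, 1)

# A rater's ratings for one (hypothetical) student's essay
ratings = {"content": 4, "organisation": 3, "language use": 5}
print(score_task(ratings, rubric))  # → 80.0
```

Making the weights and maxima explicit like this forces the rater to state the criteria in advance, which is exactly what defining criteria for rating demands.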

Saturday, 5 August 2017

Reliability and Validity in Language Assessment

Both reliability and validity are important for a language test to be useful.

Reliability
Reliability, in other words, is consistency. Think of a weighing scale: a reliable scale shows the same weight for the same object on every occasion. Likewise, a test that gives me an A grade today must give me something similar a month from now, and a test that gives A grades to a group of students of the same ability must give about the same scores a few weeks later. That is, the test must be reliable. If a test gives an A grade today and an F (fail) tomorrow, the test is not reliable. Unreliable tests are not useful: they provide no dependable information about the test-taker. Therefore, we must try to minimize the effects of the potential sources of inconsistency in the test.
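The idea of consistency across administrations can be made concrete. One common way to quantify test-retest reliability is to correlate the scores from two sittings of the same test by the same students; a high correlation means the test ranks students consistently. The sketch below computes Pearson's r from scratch in Python, with invented student scores (all numbers are hypothetical).

```python
# Minimal sketch of test-retest reliability as a correlation.
# The student scores below are invented for illustration.

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x ** 0.5 * var_y ** 0.5)

# Scores of five students on the same test, a few weeks apart
first_sitting  = [78, 85, 62, 90, 70]
second_sitting = [80, 83, 65, 88, 72]

r = pearson_r(first_sitting, second_sitting)
print(round(r, 2))  # a value close to 1.0 suggests a consistent test
```

A value near 1.0 means the test ordered the students almost identically on both occasions; a value near 0 would mean today's A could well be tomorrow's F.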

Validity
Validity concerns the meaningfulness and appropriateness of the interpretations we make based on a test score. A test is valid when it actually tests what we intend to test, when we can confidently interpret the test score as a representation of the underlying language ability the test was designed to measure. Without validity, we cannot generalize our interpretations to the Target Language Use (TLU) domain, and a test score that cannot be generalized to other domains is not very useful. In other words, without validity, tests are useless.

To ensure validity, we must look at the characteristics of the test tasks and at the construct definition. Test task characteristics are important because they must match the characteristics of tasks in the TLU domain, and they must elicit the test-takers' language ability. This is possible only when the construct to be measured has been defined in clear terms.

Saturday, 3 June 2017

3 Basic Problems of Language Assessment

The three basic problems of assessment are
  1. inference
  2. prediction and
  3. generalisation.
1. Inference
Inference is what we do with the performance of a test-taker: we make inferences about the test-taker's language abilities based on how the test-taker has performed on the assessment in question. The problem is that, to make this inference, we make many assumptions about language and language performance. We assume that language performance arises from underlying language competence or underlying language abilities. We also assume that language abilities have a particular structure, and that different language abilities interact with each other. Based on these assumptions, we design our tests to systematically sample the test-taker's language, and this sample is the raw material from which we make inferences about the underlying language competence or abilities.

The real problem is that we do not know for sure what the relationship between performance and abilities/competence is, how abilities are structured, or how they interact with each other. None of these questions has yet been answered satisfactorily.

2. Prediction
Prediction is saying in advance how language abilities will be used in future, in actual situations in real time. A good test has high potential to predict actual performance. During performance, abilities interact with other conditions such as physical conditions and affective factors; prediction therefore has to take these influences into account.

The problem is that when a test makes inaccurate or wrong predictions, the decisions based on it can have serious consequences.

3. Generalisation
Generalisation is about applying the prediction to other contexts of language use as well. This is an important quality, since a language test must be able to say something about a learner's language use across many situations and language use contexts; otherwise the test has very limited relevance. Tests generally characterise different contexts so that we know what varies between them; using this information, we can apply the prediction based on a test to other contexts too.

The problem arises when a test lacks the capacity to generalise: such a test is highly parochial, and its relevance is too localised.

A larger problem
Each of the above problems conceptualises language sampling in a different way; therefore, no single approach to language sampling is possible, and no one approach can solve all the problems mentioned above. The approaches available today, each with its own focus, are:

1. Abilities approach: This approach uses a model of communicative competence. Abilities underlie performance, so it tries to build tests that elicit performance based on particular underlying abilities. Here, language processes and contexts are treated as secondary extensions.

2. Processing approach: The processing approach gives centrality to language processing, so real-time language processing in communication is the focus of assessment. Tests assess how well test-takers can handle the pressures of real-time communication. In this approach, abilities play only a secondary, service role, and assessment relies on a sampling framework that looks at performance conditions. Generalisation is thus possible only to other language use contexts that involve the same kind of language processing, which, again, is limited.

3. Contextually driven approach: In this approach, the difference between contexts is the focus. Assessment concentrates on the characteristics of contexts, and test sampling therefore covers a range of contexts so that generalising the prediction to those contexts is meaningful.

Solutions to these problems
1. Develop a model of underlying abilities
2. Develop direct performance tests combining performance and contextual problems

The first solution could portray language abilities systematically. This would align with the empirical methods used today in measuring abilities, and could thus further the existing scholarship in the field. The process would be to define language constructs, gather data to assess the constructs, and then assemble effective tests.
Its problem is that it presents a static picture of proficiency. Since it assumes that there are underlying abilities and tries to uncover them, it is a difficult endeavour, especially because we are not sure what these underlying abilities are, or what their connections with performance are!

The second solution has greater predictive quality because of the restricted situations it deals with. It emphasises context validity by looking into the characteristics of contexts and performance.
But the problem is that it deals with a limited number of contexts or domains. Validation depends on needs analysis, which in turn depends on assessment, and vice versa; it works much like ESP (English for Specific Purposes) tests, and generalisation is limited. Moreover, there is little underlying theory to explain the differences in contexts and performance in relation to abilities.

Therefore, the former 'interactive ability' model seems to be the better choice in terms of its prediction and generalisation capacities.
