In this article, the author, Andrew Baxter, explains the differences, similarities and challenges among testing, teaching and evaluation. Teaching and testing go hand in hand. “We often ask questions to check that the students have understood what we have said. Equally, we sometimes ask a question to find out whether we need to teach a point or go into it in more depth.”

What’s the difference between testing, teaching and evaluation?

What is testing?

Every time we ask students to answer a question to which we already know the answer, we are giving them a kind of test. Much of what we do in class is, in fact, testing students’ knowledge. Here are some examples.

* He goes to the cinema. They …?
* Find a word in the text that means “angry”.
* On the recording, where does John tell Susan he wants to visit?
* What is the main idea of paragraph three?
* Dictation: write down the following …
* That’s that part of the lesson finished. What do you think we’re going to do next?

Teaching and testing

Turning performance into numbers

Testing has, traditionally, measured the results of student performance.

* We choose some representative samples of language.

* We measure whether a student can use these samples.

* We then try to quantify this by turning it into a mark or grade.

* We keep a record of these marks and use this to give an end assessment.

Over time, all testing theory (whether languages or shampoo development) has traditionally been based on a semi–scientific procedure, namely:

1. Measure the performance.
2. Do something to affect the performance.
3. Measure the performance again and compare the difference.

Applying this traditional testing procedure or model to language learners has meant that the language learner is treated as a kind of plant. We measure the plant, apply the new fertiliser, and then measure the plant again to see what effect the fertiliser has had. As language teachers, we apply a PLACEMENT TEST, teach, and then give an ACHIEVEMENT TEST to see how much better the students are.
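The measure, teach, measure again procedure can be sketched in a few lines of Python. This is only an illustration: the student names and marks are invented, and the helper `gain` is a hypothetical name, not part of any testing framework.

```python
def gain(before: float, after: float) -> float:
    """Return the change between the first and second measurement."""
    return after - before


# Hypothetical marks (in percent), invented for illustration only.
placement = {"Ana": 40, "Ben": 55}     # step 1: measure the performance
# step 2: teaching happens here; the numbers record nothing about it
achievement = {"Ana": 65, "Ben": 50}   # step 3: measure the performance again

# Compare the difference for each student.
gains = {name: gain(placement[name], achievement[name]) for name in placement}
print(gains)
```

All the procedure produces is one number per student: the difference between the two marks.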

In other words, testing is generally concerned with ENUMERATION, that is, turning performance into numbers.

Testing activities and teaching activities

Teaching and testing go hand–in–hand. We often ask questions to check that the students have understood what we have said. Equally, we sometimes ask a question to find out whether we need to teach a point. We instinctively know why we ask a question: whether it is to teach or to test something.

Compare the following two exercises.

Exercise 1

Fill the gap with an appropriate form of the verb.

John __________ France every year since 1993. (visit)

John __________ France last year. (visit)

Exercise 2

In groups, discuss the differences between the two sentences.

John has visited France every year since 1993. John visited France last year.

Exercise 1 assumes that the students have some knowledge and asks them to prove it. It is clearly a testing activity. Note that if the students get the right answer, we don’t know why they wrote that answer. It may be a guess, or it might just sound right.

Exercise 2 asks the students a question about the language. In other words, it is asking them to formulate a rule they can use in other situations – a generalisable theory.

It is also trying to increase their awareness of how the language works. It is trying to help them learn: it is a teaching activity. On the other hand, some teachers would say that people don’t need to know why it is right, they just need to get it right.

Let’s compare two more exercises.

Exercise 3

Composition: A Summer’s Day at the Beach (150 words)

Exercise 4

Read the following two compositions entitled “A Summer’s Day at the Beach”. Which do you prefer and why?

Underline all the words and ideas relating to summer. Underline all the words and ideas relating to the beach. Put a tick next to the parts you like in each essay.

Put a cross next to the parts you don’t like in each essay.

If all the paragraphs got accidentally jumbled up, could you put them back in the right order? What would help you do this? Discuss your ideas with another group.

Homework: write your own composition on the same theme (150 words).

Using the same ideas as we outlined above, Exercise 3 is clearly a test: it wants the student to show us what he/she can do. Exercise 4, on the other hand, clearly tries to make the student more aware of what he/she is trying to do: it tries to increase awareness before giving the task. It tries to help the student to learn.

Teaching or testing?

Sometimes, though, teachers can get confused about whether they are teaching or testing.

We can think we are teaching when we are actually testing.

This is particularly true when we try to teach the four skills: reading, writing, speaking and listening. Here language teachers face a major problem. We don’t really know enough; that is, there are no clear rules about good listening, reading and other skills. All we have are some rather generalised ideas such as skimming and scanning, and these are not detailed enough to help us work out an effective and progressive teaching programme.

In other words, when faced with a skill that is difficult to teach, such as good listening, we normally answer this problem in one of two ways. Either we give the students lots of opportunities to show what they know so we can see if they’re improving. We ask them to read, write or listen to texts of increasing linguistic complexity and hope they keep the same general results or even improve; or we keep the same texts and increase the complexity of the questions.

This is a bit like a doctor saying, “I don’t know what caused your illness or why you’re getting better, but your temperature is going down.” All we can do to teach the four skills is expose students to language and take their temperature via testing to see if they’re getting better.

Or we substitute the skill that is difficult to teach with one that is easy to teach. While the rules for skills are not very clear, we do have some very good rules for grammar and vocabulary, which makes them easier to teach (however, writing a grammar/vocabulary test can be complex). So we sometimes believe we are teaching or testing a skill, when really we are practising or testing grammar or vocabulary. For example, many speaking tests are disguised grammar revision: they can become an oral test of grammar. They don’t test real speaking skills at all, such as interrupting without causing offence.

Why is this? Because the semi–scientific plant model of testing which we looked at earlier has some major problems. The next part covers these problems.

Problems with testing

Problem 1: Skills into numbers

As we saw, testing is based on an idea from science: measure, make changes, measure again and compare.

One problem with the scientific model is that not everything can necessarily be measured in this way. There are some things we can easily test in this way, e.g. the present simple third person –s.

But other skills are more difficult to measure. How, for example, can we quantify a student’s ability to make useful contributions to the class?

* First, we would have to define “useful” and “contribution” in a way that we could measure them.

* We could define “useful” as “successfully explaining something to another student”.

* We could define “contribution” as “answering a question put to the whole class by the teacher”.

* We could now count how many times a student successfully answered a teacher’s question and the majority of the rest of the class understood.

The problem with this is that we are now measuring how many times a student “successfully answered a teacher’s question and the majority of the rest of the class understood”. This is not necessarily the same thing as making a useful contribution to the class.

So there are two dangers when assessing skills that are difficult to measure.

* We may take something we all understand and re–define it to make it measurable; but, in doing this, we may change the very thing we are trying to measure.

* If something is too difficult to measure, we leave it out of the test – even if the skill is very important.

In the end, we arrive at a position where we are only measuring the easily–measurable, rather than assessing the performance we are trying to improve.

TASK

Listen to your colleagues having (L1) conversations in the staff room.
What percentage of their natural spoken language consists of full sentences?

What percentage consists of sentence fragments linked by intonational devices and ums and ers? How often do you teach students to speak in fragmented sentences?

Other problems with testing

Problem 2: Results versus processes, what versus why

Another problem with this semi–scientific system of QUANTITATIVE MEASUREMENT is that it does not record QUALITATIVE DATA. Measuring will tell us if the plant has grown, but not why (or why not). It gives us information about the results, but doesn’t tell us anything about the process.

In Exercise 4, the composition task described above, we would get a much better idea of the student’s abilities, because we could see some of the processes behind the work. For example, we could look at where the student put the ticks and crosses in the essays, and then see if and how these were reflected in his/her own essay.

Problem 3: Standardisation and odd results

A third problem with the scientific model is that the fertiliser given to the plant must always be the same, or the results cannot be compared. We must remove the variables in order to assess the success of the programme. It is difficult to see how this can work in teaching. In schools, all the teaching would have to be the same, or we couldn’t really compare the progress of individual students. This model of testing therefore leads to rather authoritarian teacher–proof methodologies.

The scientific model is also more interested in general trends, and strange individual results are often ignored. For example, imagine that in a listening test all your students get 90%, but your best student only gets 10%. For us as teachers, it is that one odd result that we would want to investigate.
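A small Python sketch shows how a general trend can hide that one odd result. The class size of 20 and the 50-point threshold are assumptions made for this illustration; the article only gives the 90% and 10% marks.

```python
from statistics import mean

# Hypothetical class of 20: nineteen students score 90%, the best student 10%.
scores = [90] * 19 + [10]

# The general trend: the class average still looks healthy.
class_mean = mean(scores)

# The odd result: any mark more than 50 points from the average
# (an arbitrary threshold chosen for this sketch).
odd_results = [s for s in scores if abs(s - class_mean) > 50]

print(class_mean, odd_results)
```

The average alone suggests the class did well. The odd result only shows up when the individual marks are inspected.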

TASK

Choose a coursebook – perhaps the one used in your school – and select at random: three listening exercises, three reading exercises and three speaking exercises. What is the purpose of each exercise? Is it:

… testing grammar or vocabulary? (e.g. Mr Brown ______ the cinema; etc.)
… testing the student’s understanding? (e.g. via multiple–choice questions about information in the text; information gaps; etc.)

… teaching the student to read/listen/speak better? (e.g. Does it include advice about how to improve reading or listening, practising interrupting, etc.?)

… teaching the student to study? (e.g. Does it teach classroom language? Does it help the student to find answers to their own questions?, etc.)