Keynote Speaker: Eamonn Kelly, Ph.D. - George Mason University (VA)
RTI Leadership Forum
December 8, 2010
Thank you very kindly. I will try to be brief and not curdle your soup. The talk is at the level of research methods, in terms of the logics and argumentation that apply to each one. I think it’s important that the assumptions of different methodological approaches be understood, because the people in the different methodological schools move in different, as it were, commissive spaces. They have different commitments to values: what is evidence, what is practice, how do you change it? They vary.
Because it’s a luncheon talk, I want to tell you a couple of stories. I had a grant from the National Science Foundation in the early 2000s to write the handbook on design research methods in education, which came out in 2008, and it gave me the opportunity to interview some fantastic people, among them Everett Rogers of diffusion of innovations fame, Don Saari, who’s a research mathematician at Irvine, and also Steve Toulmin, the philosopher. It was a huge education for me.
So I want to go through the story that I heard. I asked Everett Rogers, if you’re familiar with diffusion of innovations, “Why is it that when we experimentalists run an experiment and get a result, it just doesn’t work when we bring it out into implementation?” And he said, “Let me tell you a story.” Now, Everett Rogers spent a career looking at diffusion of innovations, and started his work looking at the applications of fertilizer for corn. He claims this was a true story. So you have got a group of researchers at a chemical company. They do a split-plot, factorial, randomized Greco-Latin square design, and are able to show that the fertilizer actually works. The corn that gets the fertilizer is better than the other corn. They send out their sales team. They’re surprised that actually very few people are interested. (laughter) But some people take it on. They wait a year. At the end of the year they expect huge sales. They get no sales. In fact, they find a dip in the sales of their other products. This is quite upsetting to them. They do a survey of the farmers and find out that it didn’t help the corn at all. It burned the corn. The corn was worse. This shocked them. They thought, “We’ve done the experiment wrong,” because, you know, maybe the P values were off. They re-ran their experiment. It came out the exact same way. Quote, unquote, it worked. So they were bewildered.
They hired what would have been in the 1950s a field anthropologist to go out and find out what happened. And the farmers said that it just burned the crop, it was terrible, they wouldn’t use it again. Next slide. Here was the report he wrote back. (sustained laughter) Just hold it there. This reflects Golda Meir’s comment about Moses. She was once asked what she thought of Moses, and she said, “Mixed feelings about him…he spent 40 years wandering in the desert because he wouldn’t stop and ask for directions.” (laughter) “And he managed to find the only property in the area without oil.” (loud laughter)
So what the company did was they decided to pre-dilute the fertilizer and get around the problem of the men doing it. This required them, by the way, to change the size of the canisters, ’cause you had to add water, and that was an extra expense to put on the back of the tractor, but they decided this would solve the problem and once the sales took off they’d be in good shape. So they sent this out and they waited again. At the end there were no sales, and again the products that they had been selling also suffered lower sales. This time they sent the field anthropologist straight out and said, “What’s going on?” And the field anthropologist said, “I’m not really sure, but you know what I’m going to do? I’m going to watch how they do it.” So he’s watching how the process goes on. And then, this is the second field report, and this is what he wrote back. (laughter) See the problem? Right. So the assumption made back in the experiment was that it’s a certain dosage. They thought they’d fixed that problem with the dilution, but if you have no speedometer, you’re going to go at the speed you’re comfortable with. And if you go too fast, nothing happens. If you go at the right speed, it works. If you go too slow, it burns the crop.
So they were thinking about retrofitting the tractors with speedometers, and the story ends that when they finally got it all right, the local co-op decided there was no future in corn and went to Belgian endives. (laughter)
The second story’s from Don Saari, a research mathematician. He wrote up the story I’m going to tell you in one of the chapters in this handbook for design research methods. I asked him the same question. I said, “Why is it that when you get this finding with a P less than .05 and an effect size, it doesn’t seem to work, particularly in what Toulmin would call the clinical sciences, like education or public health?” And he said, “Well, you’ve got to think about it as an economist,” which he is, “think about it in terms of games. The exchanges that people do.” And he said, think about a simple game in which I have $100. That’s the game. I have $100. And I’m willing to give you part of that $100 if I like the split that you suggest. If I like the split, I’ll give you that amount of money. If I don’t like the split, I won’t give you anything. I’ll keep my $100. Right? And this game has been tried out around the world, and there are cultural differences. There’s some places that say give me all the money. That doesn’t work. Give me $99. That doesn’t work. There’s some places that say I’ll be happy with $2. Turns out in the U.S. people want to go to 50/50 for some reason. This has been studied in economics, and it turns out that depending upon how the orientation is, you can end up with the same figure but with a very different emotion. So I said, “Fine, how does that apply in education?” He said, “Okay, think about a teacher as somebody who has the $100 of knowledge, of expertise, and a student comes to them and that game is played.” So if the student says, “Look, I’m just going to sit here. You do all the work and give me an A,” that’s expecting everything and giving nothing in return. If the student says, “What an honor to be here. I will do whatever you like, give me extra work. I’ll be a GA for you,” that changes the relationship between the student and the teacher. And according to Don Saari, there are these attractors.
Negative attractors and positive attractors when the game is played, and if it starts out negatively, you can bargain to a place but everybody’s upset. You can end up in the same place positively, depending upon how the game is set. So in this case, as teachers are working with students, they have got that question about what am I getting in return for this split, and if you’re doing something like introducing an innovation like RTI, you have got another game, as it were, going on between the administrators and the teachers. And the administrators and the teachers have to figure out this game. How is this split? You tell me what to do…I do all the work. According to Don Saari, one way to think about how these things play out is in terms of how gains get negotiated rather than in terms of causal factors.
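The payoff rule of the $100 split game described above can be sketched in a few lines of Python. This is only an illustration: the function name `play_split_game`, the `max_acceptable` threshold, and the sample numbers are assumptions made for the sketch, not part of Saari's account, and the code captures only who ends up with what, not the emotional framing or the attractors.

```python
def play_split_game(requested: float, max_acceptable: float, pot: float = 100.0):
    """Return (holder_payoff, requester_payoff) for one round of the split game.

    The holder owns the pot. The other player asks for `requested` dollars.
    The holder agrees only if the request is no more than `max_acceptable`
    (a hypothetical stand-in for "I like the split"); otherwise the holder
    keeps everything and the requester gets nothing.
    """
    if not 0 <= requested <= pot:
        raise ValueError("requested amount must lie between 0 and the pot")
    if requested <= max_acceptable:
        return float(pot - requested), float(requested)
    return float(pot), 0.0


# Hypothetical holder who accepts any request up to half the pot.
for ask in (99, 50, 2):
    holder, requester = play_split_game(ask, max_acceptable=50)
    print(f"ask ${ask}: holder keeps ${holder:.0f}, requester gets ${requester:.0f}")
```

Note that two rounds can end at the same dollar split while the parties feel very differently about it; that part of the story is outside what this payoff rule models.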
Okay. Perception and the diffusion of innovations. So the work of Everett Rogers is how ideas, how products like iPhones, how new ideas get diffused over a long period of time in society. Next. So he says diffusion of the process, known as diffusion is a process, notice there’s no talk here about the innovation as a thing. We’re not talking about the iPhone. We’re talking about a process of innovation, right? In the process it involves communication. So innovation is done through people talking to other people. It’s kind of a word of mouth. Positive and negative. Remember the fertilizer. It goes through certain channels over time among members of a social system. So here is where the inventor’s irritation comes in. The inventor knows that the iPhone is good. They know that the Zune—have you heard about the Zune?—no. They know the Zune is good. Right? Nobody’s using it. So there are issues around diffusion of innovations that have nothing to do with the integral value of the product. Beta was better than VHS. People didn’t use it. For an awful long time, Mac was better than PC. People didn’t use it. There’s an awful lot of factors going on outside of the quality of the thing. P less than .05 you’ve got an effect size. Next.
Notice that adoption is a multi-year process. Kindergarten in this country took, guess how many years, to scale? Three years? Five years? Eight years? Fifty years. Fifty years for it to scale. So when you’re thinking about a 10-year window as to how things are doing, that’s probably too short. Scaling of practices takes a very, very long time. It’s a multi-stage process, and it’s a human process in which people are judging the value of it. They’re deciding, does this make sense for me or not, and if they don’t say it makes sense, the adoption will not happen, in spite of the irritation of the inventor that it’s actually good for them. It’s a process of information, persuasion, and finally decision making. Next.
Successful innovation, it has to be perceived to have a relative advantage. Notice it has to be perceived to have a relative advantage. The fact that you know that it has a relative advantage is beside the point. Back to the fertilizer. The fact that you know P less than .05…it has to be seen by the person on the ground as having an advantage. Next.
So the questions according to Everett Rogers are, Is the innovation perceived as better than the idea, product, or technique it hopes to supersede? Notice that the people who are doing what they’re doing are not working in a vacuum. They’re not sitting there waiting for your innovation. They have a practice. They have an approach. They’ve been doing it for years. They understand the incentives. They have got all of that in place, and you’re asking them to put all that aside. All that cost. And to take on the innovation. They’re going to do it slowly. Next.
There is a positive relationship between observability and the uptake of an innovation. So you’ll see an awful lot of advertising of things so that you can see it. A lot of the RTI work initially is going to be done in contexts in which nobody can see the work being done. What you’re getting is reports that it worked but you can’t actually see a thing work, so the observability is low and consequently the chances of adoption are ….wrong key.
Trialability. Anytime you go to Starbucks they try trialability on you. They give you tiny little cups. They give you a little bit of what they call a cranberry bliss bar, right? And this will work, but it will work if a number of factors are in place and one is compatibility. The studies show that if the innovation is too far removed from the practices, the values, and the approaches that you currently have, no matter how good it looks to the inventor, it probably won’t be adopted.
Another factor is complexity, and this is a really important one for RTI. If the person perceives that what they’re being asked to do is more complex than what they’ve been doing, it is negatively related to adoption. The problem of complexity is one that is sincerely problematic here, I think. So the question is, is this practice hard to understand and difficult to implement? Notice that the expectation’s quite high. It has to apply to all students. I take this from one of the 2010 publications: data-based decision making is the essence of good RTI practice. Screening tools must be reliable, valid, and demonstrate diagnostic accuracy, no small feat. Benchmarks, and cut scores, must somehow be set. Progress monitoring, notice, may include, excuse me for having to read it, data from tests of cognition, language, perception, and social skills, and when you get around to “changing instructional intensity” this could involve a number of factors: changes in instructional time, level, frequency, group size, or it could involve outside help. That’s quite an awful lot to take on in terms of trying to be able to work with your practice.
If you’ve got a psychometric background here, if you look at the number of variables that could be pertinent just on the assessment side, in many cases you probably have more attributes that you’re measuring than you have items that are measuring them. This causes a really big problem. And even if you take all that complexity and reflect it on to a single line as in Item Response Theory, you’re not really addressing the problem because if the person has a complex profile you’re not going to be able to see it if you just reduce everything to IRT theta.
As I was listening, it sounds that RTI might be a lot like good formative assessment. You do formative assessment, you think about it and you change, you respond. This is a quotation from Lorrie Shepard, dean of the college of education in Colorado. She was responding to a special issue on formative research. She said, “Formative assessment is of little use if teachers do not know what to do when students are shown to have a problem.” As she said, formative assessment does not form instruction. So you’re left with this problem of knowing that there’s a problem, but being frustrated, not knowing what to do next with it. She then says that validity research about this problem, about how formative assessment can be used to improve student learning, must be embedded in rich curriculum and respond to learning research. So what she’s saying is that the problem of how to respond is in a rich context with particular students with particular curriculum at particular times. It’s not something that you can answer offline. It arises differently each time. That’s the reference there, if you want to look it up. Educational Measurement: Issues and Practice. Very good set of papers.
So in other words, RTI as currently conceived may present teachers and their support staff with the demands of design-based research. This is kind of Goal 1 research if you’re familiar with the IES model. You’re asking somebody to think about getting information, finding there’s a deficit, and then creating some kind of a prototype response, trying out the prototype response, and seeing if it’s working. These are very short cycles. And if it’s not working, what are the principles that you fall back upon to decide how to design it differently? Cause as you go through these levels of intensity, if you don’t have learning theory, psychological theories of testing, if you don’t have something to inform, you’re going to be doing what they do at Irish parties which is to sing it one more time with feeling (laughter).
Notice that the design-based research is in addition to the expectation that you should be like an experimental researcher. Experimental researchers are a different community of people. They have different incentives and they are motivated by different things. And they have graduate students. They have psychometricians. They have assessment people. They have curriculum experts. They can take their time. They can go back. They can look at the videotapes. They can go round and round, and they can get this hothouse to work very, very well. It’s a very high demand to expect that level of expertise to be available to somebody in practice all the time. And by the way, notice that the standards for psychometrics here are quite high. It’s really a tall order to give a small number of testlets, as ETS would call them, and get a lot of “reliable and valid information.” Easy to say. Very, very difficult to deliver on.
In fact, if you look at some of the work that has been done by diagnostic psychometrics, Tatsuo Otsu for example in his group down in Georgia, it’s a really, really hard problem because what you find is, if I can just appeal to mathematicians for a second, if you have 20 socks in a…sorry. It’s 50 socks say in a drawer, and you take out 20, right?, there’s 50, combination 20, anybody remember that? That’s a huge number of ways of taking 20 socks out of 50. Imagine if 20 is the cut score. And the 50 items measure all kinds of different things. You can get a huge number of ways to get a score of 20 and they’re all different. So the teacher is left with the 20 but can’t unpack it back to what the real problem is.
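The socks count above is the binomial coefficient “50 choose 20,” which is on the order of 10^13. The short Python sketch below computes it and then uses a deliberately tiny, made-up test to show the same point in miniature: the six items and the skills attached to them in `ITEM_SKILLS` are hypothetical, chosen only so that every pattern can be listed by hand.

```python
from itertools import combinations
from math import comb

# Number of distinct sets of 20 items that could be the "correct" ones out of 50.
n_patterns = comb(50, 20)
print(f"{n_patterns:,} ways to score 20 out of 50")

# Toy test (hypothetical): six items, each assumed to measure one skill.
ITEM_SKILLS = ["decoding", "decoding", "vocabulary",
               "vocabulary", "fluency", "fluency"]


def skill_profiles(score: int) -> set:
    """Return every distinct per-skill tally that adds up to the same total score."""
    profiles = set()
    for correct_items in combinations(range(len(ITEM_SKILLS)), score):
        tally = {}
        for i in correct_items:
            tally[ITEM_SKILLS[i]] = tally.get(ITEM_SKILLS[i], 0) + 1
        profiles.add(tuple(sorted(tally.items())))
    return profiles


# Every profile below produces the same reported score of 3.
for profile in sorted(skill_profiles(3)):
    print(dict(profile))
```

Even in this toy case a single score hides several different skill profiles, which is the unpacking problem the teacher faces with a cut score of 20 on a 50-item test.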
So for long-term success RTI must consider the changes in roles, resources, support, expertise, demands and costs that relate to each player in the implementing environment. Not in the original experimental context, in the context of implementation.
(laughs) Leave you with this one. Here’s my parting comment. Some of you remember about causation. So in the physical world, causation is a relentless imperative. So if I drop this iPhone it will fall at a certain speed and break. However, the point here (remember the early slide was causation, and the second one was agency) is that when you’re trying to get something to change, it’s not like pushing something off a cliff. You’ve got to convince people, the agents, to decide that it’s worth their while to do all this extra work, with what incentives, in order to get a gain that you value but they may not value. So there’s a lot of persuasion that has to go on, information getting that has to go on, changes in reward structure, and all that has to happen at the same time that the people who are setting the policy are thinking of moving from corn to Belgian endives. Thank you. (applause) I think we’re done. I don’t know if there’s any time for questions. If not, thank you very kindly.