Animal Advocates Watchdog

World-renowned dog behaviourist Jean Donaldson on temperament testing

There are, at present, no dog-assessment procedures that are strong on the two critical test-evaluation yardsticks of reliability and validity. Test-retest reliability is the achievement of a replicable result across multiple administrations of the test over time. If I test a dog today and again in a month, are the results the same? Inter-tester reliability refers to the achievement of similar results with different testers. If three people conduct the same test on the dog, do they all get the same results?
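As a rough illustration of how agreement between two testers (or between two test dates) might be quantified, here is a minimal Python sketch. The function names and the pass/fail labels are hypothetical and the data are invented for the example; Cohen's kappa is one common agreement statistic, not a method named in the column.

```python
import math
from collections import Counter


def percent_agreement(first, second):
    """Share of dogs that received the same label in both administrations."""
    if len(first) != len(second) or not first:
        raise ValueError("need two non-empty label lists of equal length")
    return sum(a == b for a, b in zip(first, second)) / len(first)


def cohens_kappa(first, second):
    """Agreement corrected for the agreement expected by chance alone."""
    n = len(first)
    observed = percent_agreement(first, second)
    counts_first, counts_second = Counter(first), Counter(second)
    labels = set(first) | set(second)
    # Chance agreement: probability both administrations pick the same label
    # if each labelled dogs independently at its own observed rates.
    expected = sum(counts_first[l] * counts_second[l] for l in labels) / (n * n)
    if expected == 1.0:
        return math.nan  # undefined when every dog gets one identical label
    return (observed - expected) / (1 - expected)


# Invented example data: eight dogs scored by two hypothetical testers.
tester_a = ["pass", "pass", "fail", "fail", "pass", "fail", "pass", "pass"]
tester_b = ["pass", "fail", "fail", "fail", "pass", "pass", "pass", "pass"]

print("percent agreement:", percent_agreement(tester_a, tester_b))
print("Cohen's kappa:", round(cohens_kappa(tester_a, tester_b), 3))
```

The same two functions apply to test-retest reliability if the two lists hold one tester's results for the same dogs on two different dates.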

Validity is the test's ability to predict behaviour in the real world. Given the notoriously weak track records of tests in these areas, it's safe to say these consultants, and others who read a lot into behaviour evaluations and temperament tests, are on shaky ground and ought to have been more circumspect. They may even be using procedures that haven't undergone reliability or validity testing at all!

You raise a behaviour-versus-personality question that has interesting nature-nurture shades, and I will defer that discussion for a future column. The other, more practical, question is that of prognosis assessment – i.e., how can one tell whether a dog with an aggression problem is a good candidate for behaviour modification?

With reliability and validity problems plaguing currently available behaviour tests, the remaining avenue for prognosis information is history taking. History is usually obtained by client interview. The value of history is that it gets at both context and trend. There's a saying that goes, "Behaviour predicts behaviour." What a dog does today in a certain context is the best predictor of what he will do in that same context tomorrow. This dependence on context is thought to be part of the reliability and validity obstacles in behaviour evaluations.

Does a tester represent all people to a dog? Do the test items adequately simulate real-life contexts? In the case of history, these bases are better covered, and if trend is added to that, predictions become firmer. What a dog did in the last three weeks, three months or three years in a certain context is the best predictor of what he will do in that same context tomorrow.

The downside of history taking is reporting error. We have all heard the stories of enacted events in law-school auditoriums and the wildly conflicting eyewitness accounts that ensue from the observing students, all given with great self-assurance. In the case of reporting dog-bite incidents, this notoriously poor recall of details is potentially compounded by the strong emotions involved and any vested interest an interview subject might have in subconsciously (or consciously) inflating or deflating severity. These factors must always be borne in mind when taking history.

Prognosis assessments should incorporate thorough histories and, if necessary to complement or confirm, direct observations in the real contexts in which the problem occurs. This requires greater legwork than a typical temperament test, but avoids those lethal testing problems. In the case of acquired bite inhibition (a vital prognostic indicator), history is the only means one can utilize, as deliberately orchestrating a bite is ethically too difficult to justify.

http://www.calgaryhumane.ca/animal_behaviour_dog_aggression_jd1.asp

A good temperament test is one that would do two things: (1) provide consistent results regardless of who's testing, as well as results consistent across time (a week later, or after a few training sessions, the dog's results should be the same; if not, the test is not good or else "temperament" will have to be redefined); and (2) predict behavior in the real world.

These two yardsticks are well known in the science literature (the testing of humans on a dizzying array of parameters has been examined for many, many decades) and are called reliability (inter-tester reliability and test-retest reliability) and validity. There are existing experimental procedures to evaluate whether any given test does well on either or both of these measures.

Interestingly, the animal sheltering world has not used either of them to decide whether tests used on shelter animals are any good. The prevailing model has been to come up with some tests, support them with logical argument and then make the test available to other shelters. The absence of data collection and objective analysis would never occur in any other field.

So, there are no tests out there that have been formally tested and had the tests on the tests peer-reviewed, which is the best modality I know to get beyond the realm of my opinion/your opinion and build real knowledge. One exception is certain sub-tests on a behavior evaluation that Emily Weiss came up with at U of Wichita to predict whether dogs from shelters might make good service dogs. She has presented some evidence but we need more, a lot more. In '99 we had a nationally renowned shelter expert give a seminar at SF/SPCA during which she tested a bunch of our shelter dogs.

The sample size was too small to publish, but the validity follow-up consisted of 5 out of 5 false positives and 1 out of 3 false negatives: 5 dogs were said to be unadoptable and were adopted out and, at the two-year post-adoption mark, were still in their homes without incidents of aggression and with no reported significant behavior issues (chewing, etc. excepted), while 3 labeled adoptable went out and one came back within a couple of weeks with dog-dog problems and a redirected bite. It sure got us thinking about the existing dogma out there.
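The follow-up tally can be written out as simple arithmetic. This is only a sketch using the counts reported above; the function name and the terms "false-discovery share" and "false-omission share" are labels chosen for the illustration, not terms from the column, and with eight dogs the figures are anecdotal, as the text itself notes.

```python
from fractions import Fraction


def misclassification_shares(flagged_total, flagged_wrong,
                             cleared_total, cleared_wrong):
    """Return (false-discovery share, false-omission share) as exact fractions.

    flagged_* : dogs the test labelled unadoptable, and how many of those
                turned out fine at follow-up (false positives).
    cleared_* : dogs the test labelled adoptable, and how many of those
                showed problems at follow-up (false negatives).
    """
    if flagged_total <= 0 or cleared_total <= 0:
        raise ValueError("each group needs at least one dog")
    if not 0 <= flagged_wrong <= flagged_total:
        raise ValueError("flagged_wrong must lie between 0 and flagged_total")
    if not 0 <= cleared_wrong <= cleared_total:
        raise ValueError("cleared_wrong must lie between 0 and cleared_total")
    return (Fraction(flagged_wrong, flagged_total),
            Fraction(cleared_wrong, cleared_total))


# Counts from the seminar follow-up: 5 of 5 "unadoptable" dogs did fine,
# and 1 of 3 "adoptable" dogs came back with problems.
false_discovery, false_omission = misclassification_shares(5, 5, 3, 1)
print("wrong among dogs labelled unadoptable:", false_discovery)
print("wrong among dogs labelled adoptable:", false_omission)
```

Keeping the results as fractions rather than percentages makes the tiny denominators visible, which matters when the sample is this small.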

I would urge your institution to come up with formal criteria for what you consider an adoptable dog, which all technical and policy-making staff sign off on, and then a test that is as objective as possible, i.e. based on observations rather than interpretations of behavior, to get at these criteria, so that no one is stuck in the subjective-judgment role. Not only does subjective judgment hamper data collection, but it also puts whoever must "play god" at risk of burnout.

http://www.bestfriends.org/archives/forums/dogmanners.html
