

Conditional Reasoning in Context: A Dual-Source Model of Probabilistic Inference

Karl Christoph Klauer, Sieghard Beller, and Mandy Hütter
Albert–Ludwigs–Universität Freiburg

A dual-source model of probabilistic conditional inference is proposed. According to the model, inferences are based on 2 sources of evidence: logical form and prior knowledge. Logical form is a decontextualized source of evidence, whereas prior knowledge is activated by the contents of the conditional rule. In Experiments 1 to 3, manipulations of perceived sufficiency and necessity mapped on the parameters quantifying prior knowledge. Emphasizing rule validity increased the weight given to form-based evidence relative to knowledge-based evidence (Experiment 1). Manipulating rule form (only–if vs. if–then) had a focused effect on the parameters quantifying form-based evidence (Experiment 3). The model also provides a parsimonious description of data from the so-called negations paradigm and adequately accounts for polarity bias in that paradigm (Experiment 4). Relationships to alternative conceptualizations of conditional inference are discussed.

Keywords: conditional reasoning, probabilistic reasoning, dual-process model

Recent years have seen an increased interest in the role of prior knowledge in conditional reasoning. Such work is characterized by a couple of features: Prior knowledge is activated either through the use of contents with which participants have prior experience (e.g., Beller, 2008; Beller & Spada, 2003; Cummins, Lubart, Alksnis, & Rist, 1991; Thompson, 1994) or by providing explicit prior information about the relationships between the antecedent and consequent of a conditional rule, often in the form of bivariate frequency information (e.g., Evans, Handley, & Over, 2003; Oaksford, Chater, & Larkin, 2000; Oberauer & Wilhelm, 2003). In contrast, earlier work frequently relied on so-called abstract materials, for which little prior information is presumably available. Simultaneously, instructions in earlier work tended to stress the notion of logical necessity, according to which a conclusion can be drawn only if it logically follows from the given premises, whereas the recent work more frequently relies on a graded response format in which the acceptability, plausibility, or probability of the conclusion given the premises has to be assessed (e.g., Liu, Lo, & Wu, 1996; Oaksford et al., 2000). We refer to the recent line of research as research on probabilistic conditional inference.

In this article, we propose a dual-source model of probabilistic conditional inference. According to the model, inferences are based on two sources of evidence: logical form and prior knowledge. Logical form is a decontextualized source of evidence, whereas prior knowledge is activated by the contents of the conditional rule. The dual-source hypothesis is contrasted with the view that probabilistic conditional reasoning draws primarily on prior knowledge. In this view, exemplified by Oaksford et al.’s (2000) probabilistic model of conditional inference, the role of the conditional rule is to alter the knowledge base from which inferences are derived. In the introduction, we review existing evidence for two qualitatively distinct modes of reasoning, one based on logical form, the other based on prior knowledge. This is followed by a review of research on the role of logical form in probabilistic conditional inference. The research suggests that both prior knowledge and logical form play a role in such inferences. This hypothesis is then formally specified by means of the dual-source model. In four experiments, the dual-source model is evaluated in terms of its ability to fit the data, in terms of whether the effects of experimental manipulations targeted at specific model parameters indeed affect these parameters as expected, and in terms of whether the model adequately reproduces critical data patterns observed in probabilistic conditional inference. In each experiment, the performance of the dual-source model is compared with that of Oaksford et al.’s one-source model.

Four conditional inferences are typically studied for a conditional rule of the form “if p then q”:

Modus ponens (MP): Given the rule and “p,” it follows that “q.”

Modus tollens (MT): Given the rule and “not-q,” it follows that “not-p.”

Affirmation of the consequent (AC): Given the rule and “q,” it follows that “p.”

Denial of the antecedent (DA): Given the rule and “not-p,” it follows that “not-q.”

Karl Christoph Klauer, Sieghard Beller, and Mandy Hütter, Institut für Psychologie, Albert–Ludwigs–Universität Freiburg, Freiburg, Germany.

The research reported in this article was supported by Grant Kl 614/31-1 from the Deutsche Forschungsgemeinschaft to Karl Christoph Klauer.

Correspondence concerning this article should be addressed to Karl Christoph Klauer, Institut für Psychologie, Sozialpsychologie und Methodenlehre, Albert–Ludwigs–Universität Freiburg, D-79085 Freiburg, Germany. E-mail: [email protected]

CORRECTED MARCH 18, 2010; SEE LAST PAGE

Journal of Experimental Psychology: Learning, Memory, and Cognition, 2010, Vol. 36, No. 2, 298–323

© 2010 American Psychological Association 0278-7393/10/$12.00 DOI: 10.1037/a0018705


Each of these consists of the major premise (i.e., the rule), the minor premise (e.g., p for MP), and a conclusion (e.g., q for MP). Under a traditional interpretation of the conditional rule (p is sufficient, but not necessary for q; Evans & Over, 2004), MP and MT are logically valid inferences, whereas AC and DA are not logically valid.

Let us briefly review results obtained with abstract or arbitrary rule contents and instructions stressing logical necessity. Endorsement rates are typically close to 100% for MP, whereas the AC, DA, and MT inference rates vary from 23%, 17%, and 39% to 89%, 82%, and 91%, respectively (Schroyens, Schaeken, & d’Ydewalle, 2001). Across studies, MP is accepted significantly more frequently than MT and than AC; MT and AC are accepted significantly more frequently than DA; the difference between MP and MT is significantly and substantially larger than that between AC and DA (Schroyens et al., 2001); and the difference between AC and DA is often not significant in individual studies (Evans, 1993; O’Brien, Dias, & Roazzi, 1998). Taken together, acceptance rates tend to be ordered as MP > MT > AC > DA. Some of the variability in the inference rates reflects the fact that participants sometimes adopt a biconditional rather than conditional interpretation of “if p then q.” The biconditional interpretation (p is sufficient and necessary for q) justifies acceptance of all four inferences. But many procedural variations seem to play a role in shaping the profile of acceptance rates (Evans & Over, 2004, Chapter 3; Schroyens et al., 2001). As pointed out by Schroyens et al. (2001), these results are consistent with a (revised version of the) mental model theory (Johnson-Laird, Byrne, & Schaeken, 1992) and the theory of mental rules (e.g., Rips, 1994).
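The validity claims for the conditional and the biconditional interpretation can be checked mechanically. The following is a minimal sketch, assuming that the traditional interpretation is formalized as the material conditional (an assumption made for illustration; the function and variable names are hypothetical). An inference counts as valid if the conclusion holds in every truth assignment in which both the rule and the minor premise hold.

```python
from itertools import product


def conditional(p, q):
    # Material conditional: assumed formalization of "p is sufficient,
    # but not necessary, for q".
    return (not p) or q


def biconditional(p, q):
    # "p is sufficient and necessary for q".
    return p == q


# Each inference is a pair (minor premise, conclusion), both functions of (p, q).
INFERENCES = {
    "MP": (lambda p, q: p, lambda p, q: q),
    "MT": (lambda p, q: not q, lambda p, q: not p),
    "AC": (lambda p, q: q, lambda p, q: p),
    "DA": (lambda p, q: not p, lambda p, q: not q),
}


def is_valid(rule, minor, conclusion):
    # Valid iff the conclusion is true in all cases where rule and minor premise are true.
    return all(
        conclusion(p, q)
        for p, q in product([True, False], repeat=2)
        if rule(p, q) and minor(p, q)
    )


def validity_profile(rule):
    return {
        name: is_valid(rule, minor, conclusion)
        for name, (minor, conclusion) in INFERENCES.items()
    }


if __name__ == "__main__":
    print("conditional:  ", validity_profile(conditional))
    print("biconditional:", validity_profile(biconditional))
```

Under these assumptions, the conditional reading licenses only MP and MT, whereas the biconditional reading licenses all four inferences.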

Studies using contents for which prior knowledge is available also address these inferences and typically require ratings of plausibility, probability, or confidence in the truth of the conclusion. In such studies, there is even more variability in the profiles of ratings over the four inferences, but two major variables account for much of it: perceived sufficiency of p for q and perceived necessity of p for q. Perceived sufficiency and necessity have been assessed in different ways; one possibility is to have participants rate sentences such as “It is necessary for p to happen in order for q to happen” (perceived necessity) and “p happening is enough to ensure that q will happen” (perceived sufficiency; Thompson, 1994, p. 745). Thompson (1994) systematically chose contents differing in perceived sufficiency and necessity from different domains dealing with causal relationships, permissions, obligations, and definitions. Across these domains, she consistently found strong effects of perceived sufficiency on the acceptance of MP and MT and strong effects of perceived necessity on DA and AC: Perceived sufficiency and MP and MT acceptability are monotonically related as are perceived necessity and DA and AC acceptability. Figure 1 (middle panel) illustrates the typical results with data compiled from experiments by Liu (2003; Experiments 1a and 2), which we use as an illustrative example throughout the introduction. Perceived sufficiency and necessity were varied in three steps: high (H), medium (M), and low (L), creating six rules HL, ML, LL, LH, LM, and LL, with the first letter referring to degree of sufficiency and the second letter to degree of necessity. Contents with high sufficiency of p for q (HL) received the highest MP and MT ratings, followed by contents with medium sufficiency (ML), whereas contents with high necessity of p for q (LH) received the highest AC and DA ratings, followed by contents with medium necessity (LM).

A distinction related to perceived necessity and sufficiency is that between alternative antecedents and disabling conditions. An alternative antecedent is an event distinct from p which is sufficient for q; for example, an alternative antecedent for the rule “If a stone is thrown at a window, it will break” is to fire a gun at a window. A disabling condition is a condition that prevents q from happening in the presence of p (e.g., the window is made of Plexiglas). Alternative antecedents thus undermine perceived necessity, and disabling conditions undermine perceived sufficiency. Perceived necessity and availability of disablers as well as perceived sufficiency and availability of alternatives are relatively highly correlated (Verschueren, Schaeken, & d’Ydewalle, 2005a), and they engender similar effects on the profiles of ratings over the four conditional inferences (e.g., Cummins et al., 1991; De Neys, Schaeken, & d’Ydewalle, 2003).

One or Two Modes of Reasoning?

One issue raised by the two different lines of research is whether they reflect the operation of two distinct modes of reasoning. The first mode capitalizes on prior knowledge about the particular p and q related by the rule. The second mode capitalizes on the logical form of the proposed inference, irrespective of content.

Consider the first mode. When confronted with premises and a proposed conclusion, reasoners might sample their long-term memory, weighing memory traces of situations in which the conclusion and the minor premise were simultaneously fulfilled (e.g., cases with p and q for MP) against those in which the conclusion was not fulfilled, but the minor premise was in force (i.e., cases with p and not-q), retrieving counterexamples in the form of alternatives (i.e., factors other than p that entail q) and/or disablers (i.e., factors that prevent q from happening in the presence of p). These bits of information are then integrated and condensed into a rating of plausibility, probability, or confidence in the truth of the conclusion as the task may require (e.g., Verschueren et al., 2005a).

In contrast, the second mode might reflect an effort to judge the logical validity of the proposed inference through an algorithm such as specified in mental model theory (Johnson-Laird et al., 1992; Schroyens et al., 2001) or the theory of mental rules (e.g., Rips, 1994), an algorithm that is endowed with the competence to provide decontextualized, logically correct assessments of the inferences but is fallible due to capacity constraints, misinterpretations, intrusions of strong knowledge-based associations, and a host of other factors.

The first mode is what is tapped in studies using probabilistic instructions and response formats; the second mode is more strongly implicated in studies using instructions emphasizing logical necessity and abstract content. Note that both modes are likely to involve multiple, dissociable processes when scrutinized more closely (see, e.g., Verschueren et al., 2005a, for a dual-process characterization of the first mode). For lack of better terms, let us refer to the first mode as knowledge-based and to the second mode as form-based (Beller & Spada, 2003).

A few studies have addressed this issue head on by contrasting instructions intended to elicit the knowledge-based mode with instructions intended to elicit the form-based mode while keeping rule contents constant. Thus, Markovits and Handley (2005; see also Markovits & Thompson, 2008) compared (a) an inductive condition with probabilistic response format (“How probable is it that the conclusion follows?”) with (b) a deductive condition with a binary response format (“Is it certain that the conclusion follows?”). Markovits and Handley noted that their results are consistent with a particular one-mode model invoking only the knowledge-based mode of reasoning. In this threshold model, an inference is accepted in the deductive condition if and only if it receives a very high probability rating in the inductive condition. In other words, in the threshold model, subjective probability underlies both the responses in the inductive condition as well as in the deductive condition, but the binary judgments in the deductive conditions are generated by placing a high threshold on an underlying probability scale.
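The threshold model can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: the threshold value of 0.9 and the example ratings are hypothetical, and the function names are chosen for this sketch only. The point it illustrates is that a single probability scale with a cutoff can never accept a problem that is rated lower than a problem it rejects.

```python
def deductive_responses(ratings, threshold=0.9):
    # Map inductive probability ratings (0..1) to binary accept/reject
    # responses by placing a (hypothetical) high threshold on the same scale.
    return {problem: rating >= threshold for problem, rating in ratings.items()}


def respects_rating_order(ratings, threshold):
    # True if every accepted problem is rated at least as high as every
    # rejected problem, which a single-scale threshold model guarantees.
    responses = deductive_responses(ratings, threshold)
    accepted = [ratings[name] for name, ok in responses.items() if ok]
    rejected = [ratings[name] for name, ok in responses.items() if not ok]
    return not accepted or not rejected or min(accepted) >= max(rejected)


# Hypothetical ratings, for illustration only (not data from any study).
example_ratings = {"MP": 0.95, "MT": 0.80, "AC": 0.60, "DA": 0.55}

if __name__ == "__main__":
    print(deductive_responses(example_ratings))
    print(all(respects_rating_order(example_ratings, t / 10) for t in range(11)))
```

Whatever threshold is chosen, the accepted problems form the top of the probability ordering, so a reversal of that ordering between the inductive and the deductive condition would be incompatible with this kind of model.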

Such threshold models were explicitly tested by Rips (2001) and Heit and Rotello (2005, 2008; Rotello & Heit, 2009). Like Markovits and Handley (2005), these authors compared inductive and deductive instructions using the same problems. Rips presented inferences that were either logically valid or not. Orthogonally, the inferences were either plausible in the light of prior knowledge or implausible. When comparing deductively valid, but implausible, problems with deductively invalid, but plausible, problems, a double dissociation emerged: Under the deductive instruction, deductively valid, but implausible, problems were more frequently accepted than deductively invalid, but plausible, problems, and vice versa under the inductive instruction. This reversal is not compatible with a threshold model in which responses under both kinds of instruction are generated from a single underlying probability scale with different thresholds.

[Figure 1 (image not reproduced). Panels: “Without Rule” and “With Rule” (y-axis: Probability Rating, 10 to 100) and “Model Parameters” (y-axis: Parameter Estimates in Percent, 10 to 100); x-axis in each panel: MP, MT, AC, DA; legend: HL, ML, LL, LH, LM, LL.]

Figure 1. Reanalysis of data from Experiments 1a and 2 by Liu (2003). The left and middle panels show mean ratings for problems without rule and problems with rule, respectively, as a function of content (HL, ML, LL, LH, LM, and LL) and inference (MP [modus ponens], MT [modus tollens], AC [affirmation of the consequent], and DA [denial of the antecedent]). The right panel shows mean parameter estimates (multiplied by 100) of the knowledge parameters for the dual-source model. τ parameters are also shown in this panel with values marked as τ. HL, ML, LL, LH, LM, and LL refer to high (H), medium (M), and low (L) degrees of perceived sufficiency (first letter in the pair) and perceived necessity (second letter in the pair).

Heit and Rotello (2005, 2008; Rotello & Heit, 2009) provided further tests of the threshold model couched in a signal-detection framework. For example, they found that signal-detection parameter d′ for the participants’ ability to discriminate between the logically valid and invalid problems was larger under the deductive instruction than under the inductive instruction, which is also inconsistent with a threshold model postulating only one mode of reasoning. Heit and Rotello (2008; Rotello & Heit, 2009) argued that a two-dimensional signal-detection model that draws on two different sources of evidence (consistency with background knowledge and perceived deductive correctness) is capable of accounting for the observed dissociations between the inductive and the deductive condition. These findings, although not dealing with conditional reasoning in particular, strongly suggest that reasoners can avail themselves of two modes of reasoning that draw on different sources of evidence (background knowledge and logical correctness) and that reliance on one source versus the other depends upon the mode of reasoning stressed by instructions (see also Evans & Over, 2004, Chapters 8 and 9 for a similar point of view in the context of conditional reasoning).
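For readers unfamiliar with d′, a minimal sketch of the standard equal-variance Gaussian computation may help. It is an assumption that this textbook formula matches the cited authors' exact procedure, and the acceptance rates below are hypothetical. Here a "hit" is the acceptance of a logically valid problem and a "false alarm" is the acceptance of a logically invalid problem.

```python
from statistics import NormalDist


def d_prime(hit_rate, false_alarm_rate):
    # Equal-variance signal-detection sensitivity: z(hits) - z(false alarms).
    # Both rates must lie strictly between 0 and 1.
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(false_alarm_rate)


if __name__ == "__main__":
    # Hypothetical acceptance rates, for illustration only.
    deductive = d_prime(hit_rate=0.90, false_alarm_rate=0.30)
    inductive = d_prime(hit_rate=0.90, false_alarm_rate=0.60)
    print(f"deductive d' = {deductive:.2f}, inductive d' = {inductive:.2f}")
```

Under a single evidence scale with instruction-dependent thresholds, only the response criterion should move between instructions; a difference in d′ itself is what signals a change in the evidence being used.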

Probabilistic Conditional Reasoning and Mode of Reasoning

The findings considered so far remain silent with regard to the possibility of an involvement of the form-based mode of reasoning in probabilistic conditional reasoning. There is in fact relatively little evidence for a contribution of the form-based mode of reasoning to probabilistic conditional reasoning. Strong effects of variables such as perceived necessity/sufficiency and availability of disablers/alternatives attest, however, that the knowledge-based mode is involved (e.g., Thompson, 1994; Verschueren et al., 2005a; Verschueren, Schaeken, & d’Ydewalle, 2005b). In all of these studies, it is usually possible and even perfectly natural to assess perceived necessity/sufficiency and the availability of alternatives/disablers without reference to a conditional rule.¹ Simultaneously, it is natural in knowledge-rich contexts to pose the questions corresponding to the above conditional inferences with the conditional rule left out. Consider the rule “If a stone is thrown at a window, then the window breaks.” “MP” without rule would simply ask: “A stone is thrown at a window. What is the probability that the window breaks?”

A few studies have presented the inferences without rule (George, 1995; Liu, 2003; Liu et al., 1996; Matarazzo & Baldassarre, 2008; see also Beller, 2008; Beller & Kuhnmünch, 2007, for similar experiments with deductive instructions) and found effects of perceived sufficiency and necessity analogous to those observed when a conditional rule is stated. Consider, for example, results from the studies by Liu (2003) shown in Figure 1. The left panel of Figure 1 presents the probability ratings without conditional rule as a function of perceived necessity and sufficiency (with labels such as HL already explained above). As when a conditional rule is present, perceived sufficiency of p for q (varied across contents HL, ML, and LL) is monotonically related to what would be MP and MT ratings if the rule “if p then q” had also been presented, and perceived necessity (varied across contents LH, LM, and LL) to AC and DA ratings. This suggests that we may not need a conditional rule to account for the effects observed in probabilistic conditional reasoning. Without rule, there is, however, no logical form on which the form-based mode of reasoning could operate.

This also raises a methodological issue: Studies that do not implement a baseline condition without rule are inherently ambiguous with regard to a possible causal role of the conditional rule. In the rare instances in which such a baseline was obtained, the effects of the presence of a rule were generally found to be small (see also Beller, 2008). For example, in the studies by Liu (2003) most of the variance in ratings without rule and with rule is accounted for by perceived necessity and sufficiency (compare left and middle panels of Figure 1).

Another relevant manipulation varies how content is mapped onto logical form. For example, Cummins (1995) and Thompson (1994) manipulated clause order in the rule, comparing inferences with rules “If p then q” versus “If q then p,” with minor premise and conclusion held constant. Formally, what are MP and MT with regard to the first rule are AC and DA, respectively, for the second rule, and vice versa. The effects of rule form were small relative to the dominant effects of perceived necessity and sufficiency (Thompson, 1994) and disabling conditions and alternative antecedents (Cummins, 1995).

Finally, Stevenson and Over (2001) looked at MP and MT inferences and manipulated the expertise of a speaker positing the major premise and thereby the perceived validity of the rule. For example, the rule might be “If Bill has typhoid, then he will make a quick recovery.” In one condition, the rule would be uttered by a professor of medicine, in another condition by a first-year medical student. Both MP and MT conclusions were assigned greater subjective certainty when the major premise was asserted by the expert than by the novice.

Taken together, there seem to be effects of rule presence, rule form, and rule validity (manipulated via speaker expertise), even if they tend to be small relative to the effects of background knowledge itself.

How can the effects of logical form be explained? As discussed above, reasoners can in principle access evidence from two sources: consistency with background knowledge and perceived deductive correctness. One possibility is therefore that reasoners draw on both of these sources where possible (i.e., where there is relevant background knowledge and a conditional rule is present) and that their judgments then integrate evidence from both sources with weights determined by instructions and other factors (Heit & Rotello, 2008; Rotello & Heit, 2009). In this view, both the knowledge-based and the form-based modes of reasoning are engaged, and they jointly determine the participants’ ratings even in probabilistic conditional reasoning. Another possibility is firmly couched in the knowledge-based mode of reasoning and exemplified by Oaksford et al.’s (2000) probabilistic model of conditional inference. In their view, the effect of the conditional rule is to alter the knowledge base from which probability ratings are derived. Specifically, a conditional rule serves to depress the perceived probability of cases with p and not-q below the levels that may be in force when no rule is stated (Oaksford & Chater, 2007, p. 164) as elaborated below.

¹ For example, considering p = “A stone is thrown at a window” and q = “The window breaks,” one might ask participants to assess, without mentioning a conditional rule, the perceived likelihood that p is sufficient for q to occur (perceived sufficiency of p for q) and the perceived likelihood that p is required for q to occur (perceived necessity of p for q). Similarly, participants can be asked to generate alternative antecedents as factors distinct from p that lead to q, and disabling conditions as conditions that prevent q from occurring in situations with p without reference to any conditional rule whatsoever.

The purpose of the present article is to explore the first possibility, which we refer to as the dual-source hypothesis (Beller & Spada, 2003). For this purpose, the dual-source hypothesis is sharpened to the point where it is specified as a computational model of probabilistic conditional reasoning with the same degree of specification and elaboration as the well-established one-source model by Oaksford et al. (2000). We then present the results of four empirical studies evaluating the ability of the resulting dual-source model to account for major findings from probabilistic conditional reasoning and compare its performance with that of Oaksford et al.’s model. We focus on that model for the comparison because it is the only one that makes precise quantitative predictions for our data. To foreshadow, both models performed reasonably well in accounting for our data, and we conclude that the dual-source model is thereby at a minimum established as a viable alternative to the dominant knowledge-based view of conditional reasoning (e.g., Geiger & Oberauer, 2007; Liu et al., 1996; Oaksford & Chater, 2007; Oaksford et al., 2000; Verschueren et al., 2005a) as exemplified here by Oaksford et al.’s model. In several respects, however, the dual-source model outperformed Oaksford et al.’s model.

Oaksford et al.’s (2000) Model of Probabilistic Conditional Reasoning

Oaksford et al. (2000) presented an impressive model of probabilistic conditional reasoning. It is a model for the subjective probabilities of the conclusions in the above conditional inferences. The idea is that conclusion probability is the perceived conditional probability of the conclusion given the premises. In particular, the probability ratings of conditional inferences MP, MT, AC, and DA are predicted by the formulae for the conditional probabilities of the conclusion given minor premise, in order, P(q | p), P(not-p | not-q), P(p | q), and P(not-q | not-p) for the “if p then q” rule. These conditional probabilities can be expressed in terms of three parameters, a, b, and e. Parameter a is the perceived probability of p events, b is the perceived probability of q events, and e the conditional probability of not-q given p. Parameter e is termed the exceptions parameter because cases with p, but without q, are exceptions from the rule. The values of these parameters summarize the reasoner’s knowledge about the particular content (e.g., stones and windows) that is used for the rule and premises. We refer to the parameters as a(C), b(C), and e(C) to highlight that different parameters are required for rules that differ in content C. Using elementary probability calculus, the relevant probabilities are as follows:

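MP: $P(q \mid p) = 1 - e(C)$,

MT: $P(\text{not-}p \mid \text{not-}q) = \dfrac{1 - b(C) - a(C)\,e(C)}{1 - b(C)}$,

AC: $P(p \mid q) = \dfrac{a(C)\,\bigl(1 - e(C)\bigr)}{b(C)}$,

DA: $P(\text{not-}q \mid \text{not-}p) = \dfrac{1 - b(C) - a(C)\,e(C)}{1 - a(C)}$. (1)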

These equations in themselves impose only mild conditions on the probability ratings, namely that prior knowledge bearing on the different conditional questions is consistent with a bivariate probability distribution relating p and q of that content. As a consequence, observed ratings should obey the laws of the probability calculus. For example, the probability rating for q given p as assessed in MP should be one minus the probability rating for not-q given p assessed in the so-called converse inference MP′ that presents the rule, minor premise p, and the negated conclusion not-q. That is, to the extent to which q is accepted given p, not-q should be rejected. There is evidence, from studies using abstract conditionals, suggesting that this consistency assumption is an oversimplification (Handley & Feeney, 2003, Experiment 3), but it is an integral part of any model couched in a probability metric, including the dual-source model presented below, and it seems to provide at least a good first approximation of the data from probabilistic conditional reasoning (Oaksford et al., 2000).

Oaksford et al.’s (2000) model makes the additional, more restrictive assumption that the exceptions parameter e(C) is small, although it may still vary from content to content, being derived from the reasoner’s current state of knowledge. This directly leads to high predicted ratings for MP, and it implies a positive contingency of p and q in the 2 × 2 contingency table crossing p and not-p with q and not-q. The model has been fit to many data sets both in the domain of abstract conditional reasoning and in the domain of probabilistic conditional reasoning (Oaksford & Chater, 2007). For these applications, different parameters a, b, and e are usually estimated for each rule.

The Dual-Source Model

The dual-source hypothesis (Beller & Spada, 2003) claims that reasoners draw on two sources of evidence. One source is the signal derived from relevant background knowledge. It encodes the degree of consistency of the conclusion given the premises with background knowledge. The second source is perceived logical correctness of the proposed inference, derived from logical form. As is evident from the studies of conditional reasoning with abstract content, perceived logical correctness is influenced by many factors such as processing constraints and interpretation of the premises. As reviewed above, these studies suggest that the inferences are perceived as logically valid with acceptance rates ordered as MP > MT > AC > DA.

When one of these inferences is not accepted, the conclusion is typically seen as neither necessitated nor logically forbidden by the premises (Evans, Newstead, & Byrne, 1993, Chapter 2). In studies using abstract content, the response “maybe” or “don’t know” might then be given to express the resulting uncertainty about the conclusion. In such cases, participants have little choice other than to fall back on extralogical sources of information such as their prior knowledge to assess the plausibility or probability of the conclusion. This leads one to expect that the impact of prior knowledge should increase going from the frequently accepted MP inference to the relatively rarely accepted DA inference. In other words, inasmuch as an attempt at logical reasoning plays a role and inasmuch as the results with abstract conditionals can be taken as a guideline for perceived logical correctness, the role of prior knowledge would be progressively reduced moving through the inferences in the order DA, AC, MT, MP.

This can again be illustrated using data from Liu (2003). In Liu’s experiments, conclusions were rated twice, once with the rule present (e.g., MP: “If a substance is a diamond, then it is very hard. Given that a substance is a diamond, how probable is it that it is very hard?”) and once without the rule (e.g., MP: “Given that a substance is a diamond, how probable is it that it is very hard?”). The left two panels of Figure 1 show mean probability ratings in percentages. Considering the left panel for problems without rule and the middle panel for problems with rule, two patterns are prominent. First, as already noted, prior knowledge has a strong impact (i.e., there are pronounced effects of perceived sufficiency and necessity). Second, the effect of adding a rule appears to be a compression of ratings toward higher values for MP and MT but not as much for AC and least for DA. Thus, the impact of prior knowledge was reduced for MP and MT but not as much for AC and least for DA. The compression of endorsements toward high values for MP and MT inferences was recently replicated by Matarazzo and Baldassarre (2008; see also George, 1995), who did not consider AC and DA inferences. It suggests in the present framework that perceived logical correctness served to push ratings for MP and MT toward higher levels.

Note also that within each inference, the probability ratings are ordered the same over contents for problems without rule as for problems with rule in most cases. This implies, for example, that perceived sufficiency, P(q|p), as assessed in MP problems without rule, predicts ratings for MP problems with rule. This in turn suggests that ratings with rule still integrate background knowledge over and above perceived logical correctness, even when logical form provides a strong decontextualized signal as in MP.

This pattern of data is consistent with a dual-source model in which participants integrate information from two sources: perceived consistency with background knowledge (knowledge-based evidence) and perceived logical correctness (form-based evidence; see also Beller & Spada, 2003). The dual-source model assumes that information from both sources is integrated as a weighted average with proportional weights given by λ for the form-based mode of reasoning and 1 − λ for the knowledge-based mode of reasoning. The factor λ varies between 0 and 1. It is assumed to be the same for each inference and rule, but it may depend upon how much the instructions stress the rule versus the particular contents and on similar context variables.

Knowledge-based evidence depends on the particular content C to which the statements p and q refer. It is summarized by parameters ξ(C,x). Parameter ξ(C,x) quantifies, on a probability scale, the subjective certainty that the conclusion in inference x is warranted by background knowledge about the particular p and q presented. In Liu’s (2003) experiments, prior knowledge was manipulated by the contents C that differ in perceived sufficiency and necessity of p for q, and it was assessed in the conditions without rule. The parameter ξ(C,x) is the contribution of the knowledge-based mode of reasoning to observed ratings (with weight 1 − λ).

As an example, consider the content C dealing with p = “A stone is thrown at a window” and q = “The window breaks.” There are four ξ parameters for this content, one for each inference under study. For example, ξ(C,MP) is one’s subjective certainty that a window at which a stone is thrown breaks, based on what one knows about stones and windows in general and on average; ξ(C,MT) is one’s subjective certainty that a stone was not thrown at a window that is not broken, and so forth. Note that these parameters refer to one’s knowledge independently of an “if p then q” rule. The inferences MP, MT, and so forth are still labeled with reference to the “if p then q” rule, but this is merely a notational convenience because it allows us to use the same labels for problems with and without rule. The inferences could alternatively be labeled by minor premise and conclusion without reference to any rule whatsoever.

The form-based evidence is different for each inference, but being decontextualized does not depend upon rule content. It is modeled by parameters τ(x), with x being one of the inferences, MP, MT, DA, and AC. Parameter τ(x) quantifies, on a probability scale, the subjective certainty that the inference x is warranted by the logical form of the conditional argument. Taking the results from abstract conditional reasoning as a guideline, we expect the parameters τ to be ordered as τ(MP) > τ(MT) > τ(AC) > τ(DA). The remaining uncertainty, 1 − τ(x), quantifies the extent to which participants are uncertain about whether the conclusion is warranted by the logical form of x. In studies using abstract content, the response “maybe” or “don’t know” might be given in the case of uncertainty. Where rich background knowledge is available, background knowledge suggests a default fall-back position in such cases, namely to use the knowledge-based evidence as summarized in parameters ξ(C,x). Integrating both the extent of certainty and the remaining uncertainty, the component contributed by the form-based mode of reasoning is thus τ(x) × 1 + (1 − τ[x]) × ξ(C,x).²

This component and the contribution of the knowledge-based mode of reasoning are integrated with weight factors λ and 1 − λ, respectively, so that the predicted rating is

$$
\lambda\{\tau(x) \times 1 + (1 - \tau(x)) \times \xi(C,x)\} + (1 - \lambda)\,\xi(C,x).
$$

² We also present so-called converse inferences with the opposite conclusion. For example, the converse inference MP′ for MP is thus: Given “if p, then q” and p, it follows that not-q. Parameter τ is defined, more precisely, as the subjective certainty, quantified on a probability scale, that the conclusion must either be accepted or rejected on logical grounds. For the original inferences, a conditional interpretation of if–then suggests accepting MP and MT, and a biconditional interpretation suggests accepting all four inferences. For the converse inferences, a conditional interpretation suggests rejecting the conclusions of MP′ and MT′, and a biconditional interpretation suggests rejecting the conclusions for all four converse inferences.

We assume that parameter τ for the original inference has the same value as parameter τ for the converse inference. For example, consider MP and MP′: To the extent to which q is seen as following logically from p, not-q should be rejected on logical grounds, leading to the assumption that τ(MP) = τ(MP′). For the converse inferences x′, the appropriate formula integrating certainty and remaining uncertainty is therefore τ(x′) × 0 + (1 − τ[x′]) × ξ(C,x′) with τ(x′) = τ(x) and, based on the consistency assumption already mentioned (the probability of not-q should be 1 minus the probability of q), with ξ(C,x′) = 1 − ξ(C,x).

Thus, the knowledge parameters ξ(C,x) enter the dual-source model in two places: (a) in the component for the form-based mode of reasoning as fall-back response when uncertain about the appropriate logical conclusion and (b) in the component for the knowledge-based mode of reasoning. The knowledge-based evidence exclusively drives ratings in conditions in which no rule is present in the first place.
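A short rearrangement, which is not spelled out in the text, may make the two roles of the knowledge parameters easier to see. Writing λ for the weight of the form-based mode, τ(x) for the form-based parameter, and ξ(C,x) for the knowledge parameter, the predicted rating with rule can be collected as follows:

$$
\begin{aligned}
\lambda\{\tau(x) + (1 - \tau(x))\,\xi(C,x)\} + (1 - \lambda)\,\xi(C,x)
&= \lambda\tau(x) + \{\lambda(1 - \tau(x)) + (1 - \lambda)\}\,\xi(C,x) \\
&= \lambda\tau(x) + (1 - \lambda\tau(x))\,\xi(C,x) \\
&= \xi(C,x) + \lambda\tau(x)\,(1 - \xi(C,x)).
\end{aligned}
$$

Read this way, adding a rule moves the rating from its knowledge-based value ξ(C,x) toward 1 by the fraction λτ(x) of the remaining distance, which is one way to picture a compression of ratings toward higher values that is stronger for inferences with larger τ(x).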

Taken together, probability ratings, P(x), should be given by the following equations for each inference x = MP, MT, AC, and DA and content C:

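$$
\begin{aligned}
P(x \mid C, \text{No rule is present}) &= \xi(C,x), \\
P(x \mid C, \text{A rule is present}) &= \lambda\{\tau(x) + (1 - \tau(x))\,\xi(C,x)\} + (1 - \lambda)\,\xi(C,x).
\end{aligned}
\tag{2}
$$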

The dual-source model can be framed as a normative model of Bayesian model averaging (O’Hagan & Forster, 2004, Chapter 7). Details can be obtained from Karl Christoph Klauer.
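As a rough illustration of Equation 2, the following Python sketch computes predicted ratings with and without rule, together with the rating for the converse inference as described in Footnote 2. The function names and the numerical values for ξ, τ, and λ are invented for the example; they are not estimates from the article.

```python
def form_component(xi, tau, converse=False):
    """Form-based component: logical certainty with weight tau,
    otherwise fall back on the knowledge parameter xi."""
    # With certainty tau the conclusion is accepted (value 1) for the
    # original inference and rejected (value 0) for the converse.
    certain_value = 0.0 if converse else 1.0
    return tau * certain_value + (1 - tau) * xi


def predicted_rating(xi, tau, lam, converse=False):
    """Weighted average of form-based and knowledge-based evidence.

    lam = 0 reproduces the case in which no rule is present.
    """
    return lam * form_component(xi, tau, converse) + (1 - lam) * xi


# Hypothetical values for one content and one inference.
xi_mp, tau_mp, lam = 0.6, 0.8, 0.4

without_rule = predicted_rating(xi_mp, tau_mp, lam=0.0)
with_rule = predicted_rating(xi_mp, tau_mp, lam)
# The converse uses the same tau and the knowledge parameter 1 - xi.
converse_with_rule = predicted_rating(1 - xi_mp, tau_mp, lam, converse=True)
```

Under these assumptions the ratings for an inference and its converse sum to 1, which is the consistency property that the probability metric requires.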

In a first test, we fit this model to the data shown in the left two panels of Figure 1 by minimizing the sum of the squared deviations of the data from the model predictions (i.e., using a least-squares objective) using an iterative gradient-based search to obtain best-fitting parameter estimates.
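The least-squares objective mentioned here can be written down compactly. The sketch below shows only the objective and a variance-accounted-for index; the optimizer itself is omitted. The definition of R² as one minus the ratio of residual to total sum of squares is an assumption, because the text does not state which formula was used, and the data values are made up.

```python
def sum_squared_deviations(observed, predicted):
    """Least-squares objective: sum of squared data-minus-model deviations."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted))


def r_squared(observed, predicted):
    """Proportion of variance accounted for, assumed here to be 1 - SSE / SST."""
    mean_observed = sum(observed) / len(observed)
    total = sum((o - mean_observed) ** 2 for o in observed)
    return 1 - sum_squared_deviations(observed, predicted) / total


# Invented ratings on a 0-1 scale, for illustration only.
observed = [0.90, 0.80, 0.50, 0.40]
predicted = [0.85, 0.80, 0.55, 0.45]

sse = sum_squared_deviations(observed, predicted)
fit = r_squared(observed, predicted)
```

An iterative search would repeatedly adjust the model parameters, recompute the predictions, and keep the parameter values that make `sse` smallest.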

There are 48 data points, which were modeled by 14 parameters. Parameters τ were a function of only the inferences and did not vary across the different contents C = HL, ML, LL, LH, LM, LL. Thus, there are four τ parameters, one for each inference MP, MT, AC, and DA.

A special feature of Liu’s (2003) experiments was that the problems labeled HL, ML, and LL used the same contents as the problems labeled, in order, LH, LM, and LL, with the statements referring to p and q exchanged from one set of problems to the other set of problems. For example, an HL rule was “If H1 is 5 years old, then H1 is a child,” where H1 stands for a boy’s or girl’s name; the corresponding LH rule was “If H1 is a child, then H1 is 5 years old,” and as a consequence, the knowledge parameters ξ should be the same (taking the exchange of p for q and of q for p into account) for the HL and LH problems.³ Thus, the six different kinds of rules are based on only three different contents. For each content C, there are four knowledge parameters ξ(C,x), one for each inference x, which we reexpressed by three parameters a, b, and e per content C: a(C) = P(p), b(C) = P(q), and e(C) = 1 − P(q|p) using Oaksford et al.’s (2000) formulae (Equation 1) without the restriction that e should be small. As already explained, this implies the relatively mild restriction that the knowledge parameters are consistent with a bivariate probability distribution for the 2 × 2 contingency table crossing p and not-p with q and not-q. With three parameters a, b, and e per content, this results in a total of nine parameters underlying the knowledge parameters ξ. Furthermore, there was one parameter λ for all problems with rule.
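The claim that exchanging p and q only relabels the inferences can be checked numerically. The sketch below is an illustration under assumed values: `equation1` restates Equation 1, `exchange_p_and_q` is a hypothetical helper that derives the parameters of the exchanged rule from the same 2 × 2 table, and the numbers for a, b, and e are invented.

```python
def equation1(a, b, e):
    """Knowledge parameters from a = P(p), b = P(q), e = 1 - P(q | p)."""
    both_false = 1 - b - a * e  # P(not-p and not-q)
    return {
        "MP": 1 - e,
        "MT": both_false / (1 - b),
        "AC": a * (1 - e) / b,
        "DA": both_false / (1 - a),
    }


def exchange_p_and_q(a, b, e):
    """Parameters of the same table after p and q swap roles.

    The new exceptions parameter is 1 - P(p | q) = 1 - a(1 - e) / b.
    """
    return b, a, 1 - a * (1 - e) / b


# Invented values for an HL-type content (high sufficiency, low necessity).
hl = equation1(0.2, 0.6, 0.1)
lh = equation1(*exchange_p_and_q(0.2, 0.6, 0.1))
```

With these values, the MP, MT, AC, and DA entries of `hl` reappear as the AC, DA, MP, and MT entries of `lh`, so the parameter values are unchanged and only the labels move.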

Model fit was not bad (R² = .88), so that the model accounted for almost 90% of the variance in the data.⁴ The weight given to rule-based evidence was λ = .40, and thus, the rule when present had only a medium-sized impact on ratings. In fact, when λ is restricted to be zero, R² is still .77 so that roughly 75% of the variance is accounted for by the knowledge parameters alone. The right panel of Figure 1 shows the estimates of the knowledge parameters ξ and the rule-based parameters τ. For example, the knowledge parameters ξ(HL,x) for the HL content are shown as small squares for each inference x = MP, MT, AC, and DA, connected by lines. It can be seen that their values were estimated to be high for the MP and MT inferences, but much lower, close to the 50% mark, for the AC and DA inferences. The knowledge parameters roughly follow the pattern of the original ratings without rule. The τ(x) parameters for the different inferences x are shown as points labeled τ. Encouragingly, the τ parameters reproduce the pattern τ(MP) > τ(MT) > τ(AC) > τ(DA) suggested by studies on abstract conditional reasoning.

In the concluding sections of the introduction, we (a) discuss how Oaksford et al.’s (2000) model alternatively deals with the effects of rule presence and (b) consider the relationship of the dual-source model to Liu’s (2003) concept of so-called second-order conditionalization. Relationships to broader theories of conditional reasoning are considered in the General Discussion.

Effects of Rule Presence in Oaksford et al.’s (2000) Model

What is the effect of the conditional rule in probabilistic conditional reasoning, and how can it be characterized in precise quantitative terms? This is one question answered by the dual-source model. Oaksford et al.’s (2000) model provides an alternative answer. As already mentioned, the idea is that probability ratings of conditional inferences MP, MT, AC, and DA are given by the formulae for the conditional probabilities of, in order, P(q|p), P(not-p|not-q), P(p|q), and P(not-q|not-p). These probabilities are expressed in terms of three parameters, a = P(p), b = P(q), and e = 1 − P(q|p), with the restriction that the parameter e, the exceptions parameter, should be small. As argued by Oaksford and Chater (2007), the only effect of adding a rule should be to reduce parameter e: “It seems that the only effect the assertion of the conditional premise could have is to provide additional evidence that q and p are related, which increases the assessment of P2(q|p) [i.e., of 1 − e]” (p. 164).

For the data from Figure 1, the model thus requires three a parameters, three b parameters, and three e parameters (estimated without restriction) for the problems without rule, taking into account that the six different kinds of problems are based on only three different sets of contents. Six new e parameters are required for problems with rule to capture the effects of the rules, remembering that the rules were different, if sometimes only in the order of p and q, across the six kinds of problems. Because the effects of the rule are to depress the exceptions parameter according to Oaksford and Chater (2007), the new parameters e were constrained to assume values smaller than or equal to the corresponding parameter estimated from the problems without rule. The model is thus based on 15 parameters when applied to reanalyze Liu’s (2003) data. Although less parsimonious than the dual-source model (14 parameters) in terms of number of parameters, the probabilistic model fit somewhat less well, with R² = .83. Note, however, that Oaksford et al. (2000) and Oaksford and Chater (2007, Chapter 5) argued that rules such as the present LH rule, which imply a = P(p) > b = P(q), are reinterpreted. In particular, P(q) is adjusted upward in the presence of the rule to meet the model’s restriction that a(1 − e) ≤ b. This subtracts from the normativeness of the model in that there are many ways in which the model parameters could be adjusted to meet the restriction a(1 − e) ≤ b imposed by the probability calculus, with probability calculus itself being silent about the normatively appropriate manner of adjusting model parameters. Nevertheless, the model remains a psychologically viable theory, and when we admitted an additional and separate b parameter for the problematic LH problems with rule, the model with 16 parameters approached the model fit of the dual-source model with 14 parameters (R² = .85).

³ The inferences MP, MT, AC, and DA for the rule “If p then q” each present the same minor premise and conclusion as the inferences, in order, AC, DA, MP, and MT for the rule “If q then p.” The knowledge parameters are defined only in terms of minor premise and conclusion, whereas the rule plays no role. This means that exchanging p and q changes the labels for the inferences, but not the parameter values. Considering, for example, the HL rule and the LH rule generated from it by exchanging p and q, it follows that ξ(HL,MP) = ξ(LH,AC), ξ(HL,MT) = ξ(LH,DA), ξ(HL,AC) = ξ(LH,MP), and ξ(HL,DA) = ξ(LH,MT).

⁴ The relatively high R² values for the dual-source model and Oaksford et al.’s (2000) model reported in this article should be interpreted in relation to the relatively high ratio of parameters to data points; for example, there were 14 parameters to account for 48 data points in Liu’s (2003) experiments. This issue does not affect, however, the comparison between our model and Oaksford et al.’s model. It is addressed in more detail in Experiment 4.

Clearly, Oaksford et al.’s (2000) model is a viable alternative to the dual-source model. Its conceptual appeal is that it does not postulate two qualitatively distinct sources of information, one based on prior knowledge and a second one based on form-based evidence. Because of the conceptual simplicity, it would probably be preferred over the dual-source model at this point, given that the data from Figure 1 are not strong enough to permit a firm decision for or against one of the two models. Another limitation of this reanalysis is that with few exceptions, it is not legitimate to test nonlinear models such as Oaksford et al.’s and our model on data averaged across persons (Rouder, Lu, Morey, Sun, & Speckman, 2008).

The Dual-Source Model and Second-Order Conditionalization

In interpreting his data, Liu (2003) proposed that for problems without rule, participants base responses on the conditional probability of the conclusion given the minor premise (first-order conditional probability) and on the conditional probability given minor premise and major premise (i.e., the rule) for problems with rule (second-order conditional probability). Conditionalizing on both minor and major premise simultaneously is called second-order conditionalization.

One way to look at the dual-source model is to see it as specifying an explicit model for the second-order conditional probabilities; they are given by the component contributed by the form-based mode of reasoning: For that component, both minor premise and major premise are considered true and given. In contrast, the component contributed by the knowledge-based mode of reasoning reflects the first-order conditional probabilities: For that component, only the minor premise is considered true and given, and the major premise plays no role. By formulating an explicit model of the second-order conditional probabilities, the dual-source model isolates, and allows one to estimate, a specifically rule-based component in the parameters τ.

According to Liu (2003), ratings for problems with rule directly reflect these second-order conditional probabilities. In terms of the dual-source model, this translates into the claim that the weight parameter λ for the form-based component is 1 and that for the knowledge-based component, 0. This implies that reliance on the rule should be perfect when a rule is asserted. For example, in the presence of a rule, there should be little room for effects of instructions that emphasize the rule over and above the degree that is already implemented in standard instructions as used by Liu, a prediction that we test in the present Experiment 1. Note, however, that Liu also assumed that participants would not always be able to compute the appropriate second-order conditional probabilities, especially for MT inferences, and that content and instructions might have an effect on how accessible the second-order conditional probabilities are.

Outlook

In four experiments, we test the dual-source model empirically. All experiments employ several phases in which the same contents are presented. In a baseline phase, problems are presented without rule; in the other phases problems are presented with rule. As already mentioned, a baseline phase without rule is needed as a control condition to allow one to assess the effects of the conditional rule in probabilistic conditional reasoning and as a means to assess how participants respond to the different problems when responses can be based only on relevant background knowledge, but not on logical form. In particular, the dual-source model and Oaksford et al.’s (2000) model make the same predictions for ratings without rule (because they use the same equations), but they differ in their predictions for the effects of adding a rule. To compare the two models, it is therefore necessary to contrast ratings for problems with rule and those for problems without rule. In all experiments, phases were separated by at least 1 week for each participant. As in Liu (2003), prior knowledge was systematically manipulated through the use of HH, HL, LH, and LL conditionals in Experiments 1 to 3.

In Experiment 1, Phases 2 and 3 furthermore differed in the emphasis put upon the rule. This leads to two major predictions for the parameter estimates of the dual-source model: (a) The manipulation of prior knowledge should systematically affect the knowledge parameters ξ, and (b) the emphasis put upon the rule should systematically influence the relative weight λ of form-based evidence versus knowledge-based evidence. Experiment 1 also allowed us to compare the fit of the dual-source model and Oaksford et al.’s (2000) model of probabilistic inference. Experiment 2 is a control experiment designed to defend some of our procedural choices.

In Experiment 3, prior knowledge was again manipulated. In addition, we compared two conditional rules differing in form, namely the “if p then q” rule and the “p only if q” rule. Both rules were presented with the same contents in different phases of the experiment. The major predictions were (a) that prior knowledge should again be mapped on the knowledge parameters, whereas (b) the form of the conditional rule should affect the parameters τ for the form-based evidence.


Finally, in Experiment 4, the so-called negations paradigm was employed. In the negations paradigm, four rules are presented for each content. The four rules differ in whether antecedent and/or consequent are affirmed or negated as elaborated below. We tested whether the model accounts for so-called polarity bias, the major effect emerging in the negations paradigm. Polarity bias was the effect targeted by Oaksford et al. (2000) in the original exposition of their model.

Experiment 1

Experiment 1 used four contents taken from Verschueren et al. (2005a), pretested to be either high or low in sufficiency of p for q and simultaneously either high or low in perceived necessity of p for q (Verschueren et al., 2005a, Appendix 1). Presented as “if p then q” rules, the four contents were

HH: If a predator is hungry, then it will search for prey.

HL: If a balloon is pricked with a needle, then it will pop.

LH: If a girl has sexual intercourse, then she will be pregnant.

LL: If a person drinks lots of Coke, then that person will gain weight.

Problems were presented in three phases of 32 problems each, separated by at least 1 week to reduce trivial carry-over effects from phase to phase. The 32 problems were generated by presenting the minor premise and conclusion of each conditional inference MP, MT, AC, and DA with rule left out (Phase 1) or with rule included (i.e., the complete logical form; Phases 2 and 3) for each content, resulting in 16 problems per phase. The remaining 16 problems presented the negation of the conclusion for each inference, or in Oaksford et al.’s (2000) terms, the converse inferences MP′, MT′, AC′, and DA′ for each content. The task was to rate the probability of the conclusion on a percent scale ranging from 0 to 100.

For example, the MP problem without rule (Phase 1) for content HL would present the observation that a balloon is pricked with a needle and ask for the probability that the balloon will pop. The MP′ problem would present the same premise and ask for the probability that the balloon will not pop. Presenting the traditional inferences along with the converse inferences has several advantages: It allows us to assess the consistency of the individual ratings in that the ratings for the original inference and its converse should add up to approximately 100% according to normative prescriptions. Consistency is a basic prerequisite for any model of conditional inference stated in terms of a probability metric. Furthermore, by averaging over both ratings (with the rating r for the converse inference entered as 100 − r), it allows us to mitigate the effects of relatively superficial response biases such as differential tendencies to endorse conclusions, whatever the inference, or general tendencies to prefer forward inferences (from p or not-p to q or not-q), whatever the inference, over backward inferences (from q or not-q to p or not-p) as might be induced by a shallow “if” heuristic.

For Phase 1 problems, participants were told that they would see an observation and that they were to rate how probable it is that a certain conclusion drawn from it would hold. Phases 2 and 3 presented the problems with rule. Phases 2 and 3 differed from each other in the degree to which the validity of the rule was stressed. For Phase 2 problems, participants were told that a rule had been stated, that they would see the rule and an observation, and that they were to rate for each problem how probable it is that a certain conclusion drawn from this information would hold.

Immediately preceding Phase 3, participants were to rate their belief in the validity of each rule in order to give them an opportunity to express that some of the rules were less plausible than others. For Phase 3 problems, participants were then told that they would see a rule and an observation and that they were to take the rule as valid without exception. They were asked to rate for each problem how probable it is that a certain conclusion drawn from this information would hold, taking the rule as valid and given the observation.

Thus, we manipulated the factors (a) prior knowledge by means of the different contents and (b) emphasis on the rule by means of the different instructions. The dual-source model was evaluated by the extent to which it met the following predictions:

Prediction 1: The dual-source model should fit the data satisfactorily, and it should provide better fit than Oaksford et al.’s (2000) model of probabilistic inference. That model can be directly applied to our data by assuming that rule presence in Phase 2 depresses the exceptions parameters e, and that emphasis on the rule in Phase 3 may further decrease these parameters.

Prediction 2: The effects of content should be mapped on the knowledge parameters ξ. In particular, knowledge parameters for MP and MT should be higher for contents high rather than low in perceived sufficiency of p for q. Simultaneously, knowledge parameters for AC and DA should be higher for contents with high rather than low perceived necessity.

Prediction 3: Rule emphasis should affect the weight for form-based evidence relative to knowledge-based evidence. Thus, λ should be higher in Phase 3 than in Phase 2.

Prediction 4: Taking the results from reasoning with abstract conditionals as a guideline, the τ parameters should be ordered across inferences approximately as τ(MP) > τ(MT) > τ(AC) > τ(DA).

Method

Participants. Participants were 15 University of Freiburg students (four male, 11 female, with age ranging from 18 to 26 years) with different majors, excluding majors that imply formal training in logic such as mathematics. Each participant was requested to go through three phases of the experiment with different phases separated by at least 1 week. Participants received monetary compensation of €14.

Procedure. In each phase, the 32 problems described above were presented in two blocks of 16 problems in individual sessions. Between blocks, participants had an opportunity to take a break. The order of problems was newly randomized for each phase and participant with the restrictions (a) that there were four problems per content in each block, (b) that an inference and its converse were assigned to different blocks in each phase, and (c) that within each block problems were blocked by content. We presented problems blocked by content because pretests suggested that a random sequence of contents was seen as very demanding and confusing.

Participants were told that no two problems presented were identical and that they were to read each problem carefully because some of the problems would differ only in small, but important, details. Furthermore, the words signaling negated statements such as “not” or “no” were presented in capital letters, because pretests suggested that these particulars were sometimes overlooked.

Following the two blocks of each phase, the computer program presenting the experiment compared responses for each inference and its converse. Problem pairs for which responses were very inconsistent, defined as a sum of ratings less than 40 or larger than 160, were then presented once more in a third block, if any such problem pairs existed. Remember that responses were given in percentages and that the ratings for an inference and its converse should add to a value of approximately 100. Sums smaller than 40 or larger than 160 imply that both the original inference and its converse had been rated as highly improbable or highly probable. Prior to the third block, participants learned by means of an example using a content different from the ones used in the problems themselves that they had rated both an inference and its converse as either highly probable or highly improbable. Participants were asked to work through such problems once more, paying particular attention to the details of the wording. The third block, if it was necessary, presented problem pairs of an inference and its converse in immediate succession.
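The screening rule for the third block can be stated as a small check. The following Python sketch is not the program used in the experiment; the function name `needs_repeat` and the example ratings are hypothetical, and only the cutoffs of 40 and 160 come from the text.

```python
def needs_repeat(rating, converse_rating, low=40, high=160):
    """Flag a problem pair whose two percent ratings are very inconsistent.

    Consistent ratings for an inference and its converse sum to about 100;
    a pair is flagged when the sum is below `low` or above `high`.
    """
    total = rating + converse_rating
    return total < low or total > high


# Hypothetical ratings (inference, converse) for three problem pairs.
pairs = {"MP": (90, 15), "AC": (85, 90), "DA": (10, 20)}
flagged = [name for name, (r, r_conv) in pairs.items() if needs_repeat(r, r_conv)]
```

In this made-up example the MP pair sums to 105 and passes, whereas the AC pair (175) and the DA pair (30) would be presented again.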

Results and Discussion

Degrees of freedom of F tests of analyses of variance with repeated measures are Greenhouse–Geisser corrected throughout this article.

Consistency. As in previous research (Oaksford et al., 2000), participants were reasonably highly consistent with respect to the ratings given to an inference and its converse over the 48 = 3 × 16 problem pairs presented. Leaving out the data from the third block in each phase, the upper left panel of Figure 2 shows for each participant the 10%, 25%, 50% (median), 75%, and 90% quantiles for the sum of the ratings given to the inference and converse as well as the correlations between the two ratings, with that for the converse inverted. Correlations are shown as characters “x” with values multiplied by 100. Quantiles close to 100 and high correlations indicate high consistency of the ratings. It can be seen that participants’ ratings appear to be fairly well calibrated in that the median (third line from bottom) is always close to 100, and the quartiles (second and fourth lines from bottom) are also close to 100. Correlations tend to be reasonably high (M = .86, SD = .12).

Summed over Phases 1 to 3, an average of 1.4 problem pairs (SD = 1.92, range 0 to 6) were repeated in one of the third blocks. For the analyses below, ratings from the third block replaced ratings from the first two blocks, increasing consistency even further. Thereafter, there remained one problem pair for one individual with sum of ratings outside the range of 40 to 160; ratings for this problem pair for this individual were treated as missing values for the model analyses.

Rating data. We averaged the two ratings of each problem pair, with the rating r given to the converse problem mirrored at 50% (i.e., with r replaced by 100 − r), resulting in 48 data points per participant. The first three panels of Figure 3 show the mean ratings obtained as a function of inference and content. As can be seen, the contents differed as expected: Contents with high rather than low sufficiency of p for q (HH and HL vs. LH and LL) received the highest MP and MT ratings, whereas contents with high rather than low necessity of p for q (HH and LH vs. HL and LL) received the highest AC and DA ratings. The effects of adding a rule were to compress ratings toward higher levels with stronger effects on MP and MT than on AC and DA. Note that within each inference, the order of ratings over contents remained the same in most cases across phases. The effects are thereby qualitatively similar to those observed by Liu (2003), but the effects of rule presence seem more pronounced in our data.
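As a minimal sketch of the aggregation step, the following Python function averages an inference rating with its converse rating mirrored at 50%. The function name and the example values are hypothetical.

```python
def aggregate_pair(rating_inference, rating_converse):
    """Average an inference rating with its converse mirrored at 50%.

    Both ratings are percentages; the converse rating r is replaced
    by 100 - r before averaging.
    """
    mirrored = 100 - rating_converse
    return (rating_inference + mirrored) / 2


print(aggregate_pair(80, 30))  # 75.0
```

A perfectly consistent pair (ratings summing to 100) simply returns the rating given to the original inference.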

Prediction 1. For the model analyses, the 48 data points in the first three panels of Figure 3 were fit by the dual-source model for each participant separately with 18 parameters per participant. These comprise 4 τ parameters (for inferences MP, MT, AC, and DA), 12 parameters underlying the knowledge parameters ξ (3 parameters, a[C], b[C], and e[C], per content), as well as 2 λ parameters, 1 for Phase 2 with rule and 1 for Phase 3 with rule emphasis. Parameter λ was set to zero for Phase 1 because there was no rule in Phase 1.
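How these parameters combine into a predicted rating can be sketched as follows. The sketch assumes that the model equation has the form λ[τ(x) + (1 − τ(x)) ξ(C, x)] + (1 − λ) ξ(C, x), and that the knowledge parameters ξ(C, x) follow from a, b, and e as the conditional probabilities of Oaksford et al.’s (2000) model, with a = P(p), b = P(q), and e = P(not-q | p). These definitions are restated here as assumptions; they are not given in this section. Function names and numeric values are hypothetical.

```python
def knowledge_probs(a, b, e):
    """Knowledge-based probabilities for one content.

    Assumes a = P(p), b = P(q), e = P(not-q | p), with a * (1 - e) <= b.
    """
    return {
        "MP": 1 - e,                      # P(q | p)
        "MT": (1 - b - a * e) / (1 - b),  # P(not-p | not-q)
        "AC": a * (1 - e) / b,            # P(p | q)
        "DA": (1 - b - a * e) / (1 - a),  # P(not-q | not-p)
    }


def dual_source_prediction(lam, tau, xi):
    """Mix form-based evidence (tau) with knowledge-based evidence (xi).

    lam is the weight given to form-based evidence; lam = 0 corresponds
    to problems without rule, where only knowledge contributes.
    """
    return lam * (tau + (1 - tau) * xi) + (1 - lam) * xi


xi = knowledge_probs(a=0.5, b=0.6, e=0.1)  # hypothetical content
no_rule = dual_source_prediction(lam=0.0, tau=0.9, xi=xi["MP"])
with_rule = dual_source_prediction(lam=0.7, tau=0.9, xi=xi["MP"])
print(no_rule, with_rule)
```

With four τ values, three knowledge parameters for each of four contents, and two λ values, this parameterization yields the 4 + 12 + 2 = 18 parameters counted above.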

A weighted least-squares objective function was minimized for each participant’s data, with weights reflecting the consistency information that is available for each data point.⁵ Goodness of fit was acceptable with mean R² = .92 (SD = .048).
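The weighting scheme of Footnote 5 and a weighted goodness-of-fit index can be sketched in Python as below. The weight formula follows the footnote; the exact form of the weighted R² (one minus the ratio of weighted residual to weighted total sums of squares, taken around the weighted mean) is an assumption, and the function names are hypothetical.

```python
def consistency_weight(r_o, r_c):
    """Weight from Footnote 5: (60 - |100 - (r_o + r_c)|) / 60."""
    return (60 - abs(100 - (r_o + r_c))) / 60


def weighted_r_squared(observed, predicted, weights):
    """Share of weighted variance in `observed` accounted for by `predicted`.

    Assumption: R^2 = 1 - SS_res / SS_tot, with both sums of squares
    weighted and SS_tot taken around the weighted mean.
    """
    total_weight = sum(weights)
    mean = sum(w * y for w, y in zip(weights, observed)) / total_weight
    ss_res = sum(w * (y - p) ** 2 for w, y, p in zip(weights, observed, predicted))
    ss_tot = sum(w * (y - mean) ** 2 for w, y in zip(weights, observed))
    return 1 - ss_res / ss_tot


print(consistency_weight(70, 30))  # 1.0
print(consistency_weight(90, 70))  # 0.0
```

Ratings that sum to exactly 100 receive full weight, and pairs at the boundary of the admissible range (sums of 40 or 160) receive a weight of zero.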

We also fit Oaksford et al.’s (2000) model. This model requires 20 parameters. These comprise 4 a parameters; 4 b parameters; as well as 4 e parameters for the problems without rule; 4 new exceptions parameters e, 1 per content, to account for the effects of adding a rule in Phase 2; and 4 further exceptions parameters e to account for the effects of emphasizing the rule in Phase 3. The resulting model fit less well than the dual-source model despite the use of more parameters (mean R² = .74, SD = .082). Remember, however, that Oaksford et al. suggested modifying the model for rule LH, which implies a > b, permitting the model to adjust b upward to meet the requirement that a(1 − e) ≤ b. When an additional parameter b was admitted for Phases 2 and 3 for the rule LH, model fit was acceptable (mean R² = .88, SD = .053) with 21 parameters. In both cases, model fit was, however, significantly worse than for the dual-source model: t(14) = –7.70, p < .01, for comparing the unmodified Oaksford et al. model and the dual-source model, and t(14) = –3.00, p < .01, for comparing the modified model and the dual-source model. The dual-source model fit the data of 14 of the 15 participants better than the modified model by Oaksford et al. Thus Prediction 1 could be upheld.

5 Each data point r was based on averaging two ratings, one for the original inference, r_o, and one for its converse, r_c, with the converse rating mirrored at 50%, that is, r = (r_o + [100 − r_c])/2. The ratings for a pair of an inference and its converse are consistent with each other to the extent to which r_o + r_c = 100. Weights for the weighted least-squares analysis were given by (60 − |100 − [r_o + r_c]|)/60. Remembering that data points with r_o + r_c outside the interval between 40 and 160 are treated as missing values, these weights range between 0 and 1. They are large to the extent to which the two ratings r_o and r_c sum to 100. The goodness-of-fit index R² in weighted least-squares is the coefficient of determination that quantifies the amount of variance in the weighted data that is accounted for by the model.

[Figure 2 graphic omitted: four panels (Experiment 1, Experiment 2, Experiment 3, Experiment 4); horizontal axis: Participant; vertical axis: Consistency (0 to 180).]

Figure 2. Consistency data for participants in Experiments 1, 2, 3, and 4 based on the sums of ratings for an inference and its converse. The lines show, in order from bottom line to top line, the 10%, 25%, 50% (median), 75%, and 90% quantiles of these sums as well as correlations between ratings for an original inference (i.e., one of MP [modus ponens], MT [modus tollens], AC [affirmation of the consequent], and DA [denial of the antecedent]) and its converse inference, with the latter mirrored at 50%. Correlations are shown as “x” marks with values multiplied by 100. Participants are ordered so that correlations increase from left to right.
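The model comparisons for Prediction 1 report t statistics with 14 degrees of freedom for 15 participants, which is what a paired t test over participants’ R² values would give. A minimal sketch of such a test follows; treating the comparison as a paired t test is an inference from the reported degrees of freedom, and the R² values below are made up for illustration, not taken from the study.

```python
import math
from statistics import mean, stdev


def paired_t(x, y):
    """Paired t statistic and degrees of freedom for two matched samples."""
    diffs = [xi - yi for xi, yi in zip(x, y)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1


# Hypothetical R^2 values for four participants (not the study's data).
r2_alternative = [0.70, 0.80, 0.75, 0.85]
r2_dual_source = [0.90, 0.92, 0.95, 0.93]
t, df = paired_t(r2_alternative, r2_dual_source)
print(round(t, 2), df)
```

A negative t value indicates that the first model’s R² values are lower on average than those of the second model, matching the sign convention of the statistics reported above.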

Prediction 2. The knowledge parameters are also shown in Figure 3 (see lower right panel). As can be seen, they roughly reflect the pattern seen in the ratings in the first phase without rule. In particular, parameter estimates reflected the factors perceived sufficiency and perceived necessity. Knowledge parameters for MP and MT problems were higher for contents high in perceived sufficiency (M = 90.33; values are given in percentages) than for contents low in perceived sufficiency (M = 50.23), t(14) = 8.12, p < .01. Knowledge parameters for AC and DA problems were higher for contents high in perceived necessity (M = 91.49) than for contents low in perceived necessity (M = 53.90), t(14) = 12.27, p < .01. Prediction 2 could thus be upheld.

Prediction 3. The mean weight parameter for Phase 2 (with rule) was .68 (SD = .35) and thus somewhat higher than the weight parameter estimated for the Liu (2003) data. In Phase 3 (rule emphasized), it was significantly increased (mean = .98, SD = .05), t(14) = 3.46, p < .01. Note that parameter λ is the only parameter that varies between Phases 2 and 3; the model thereby provides a very parsimonious description of the differences between these two phases. Prediction 3 could be upheld.

Prediction 4. The mean τ parameters are also shown in Figure 3 (see bottom right panel). As can be seen, they follow the expected order of MP > MT > AC > DA. An analysis of variance of the τ parameters with factor inference (MP, MT, AC, and DA) found a main effect of inference, F(1.52, 21.28) = 9.44, p < .01. Planned contrasts revealed that τ for the MP inference did not significantly exceed τ for MT, F(1, 14) = 1.17, p = .30, whereas τ for the MT inference exceeded the mean τ for the two invalid inferences AC and DA, F(1, 14) = 12.40, p < .01. There were no significant differences between AC and DA, F(1, 14) = 1.43, p = .25.

Inspecting the τ parameters for the different participants individually, some of the participants had equally high τ parameters for all inferences, suggesting that they followed a biconditional interpretation of rule-based evidence. On balance, Prediction 4 could be upheld.

[Figure 3 graphic omitted: four panels (Without Rule, With Rule, Rule Emphasized, Model Parameters); horizontal axis: inference (MP, MT, AC, DA); vertical axis: Probability Rating, or Parameter Estimates (in Percent) for the Model Parameters panel, 30 to 100; separate lines for contents HH, HL, LH, and LL, with τ values marked in the Model Parameters panel.]

Figure 3. Mean ratings and parameter estimates for Experiment 1. The top left, top right, and bottom left panels show, in order, mean ratings for problems without rule (Phase 1), for problems with rule (Phase 2), and for problems with emphasis on the rule (Phase 3) as a function of content (HH, HL, LH, and LL) and inference (MP [modus ponens], MT [modus tollens], AC [affirmation of the consequent], and DA [denial of the antecedent]). The bottom right panel shows mean parameter estimates (multiplied by 100) of the knowledge parameters and of the τ parameters (with values marked as τ) for the dual-source model. HH, HL, LH, and LL refer to contents presented to be simultaneously either high (H) or low (L) in perceived sufficiency (first letter in the pair) and perceived necessity (second letter in the pair) of p for q.

Belief ratings. Mean rule believability for contents HH, HL, LH, and LL was, in order, 89 (SD = 11), 92 (SD = 7.5), 39 (SD = 29), and 56 (SD = 18). The differences between contents were significant, F(1.84, 25.81) = 28.97, p < .01. In line with previous work, the pattern of rule believability closely followed the pattern for MP ratings in Phase 1 as well as the pattern of knowledge parameters for MP, ξ(C,MP) (see Figure 3).

Summary. Taken together, the dual-source model provided an adequate fit to a complex data structure. It fit significantly better than Oaksford et al.’s (2000) model despite using fewer parameters. The manipulated factors content, inference, and phase mapped in a meaningful fashion on the knowledge parameters, the τ parameters, and the λ parameters.

Experiment 2

Experiment 2 was a control experiment intended to assess the effects of procedural choices made in Experiment 1 and in the subsequent experiments. In particular, (a) the baseline phase without rule always preceded the phases with rule, and (b) participants were given an opportunity to reevaluate problem pairs for which their ratings were highly inconsistent between original and converse inference. Obtaining baseline ratings without rule first, and ratings for problems with rule second, seems a natural order of presenting problems. Furthermore, as already mentioned, consistency is a basic requirement for both models (the dual-source model and Oaksford et al.’s, 2000, model), and inconsistency is attributed to measurement error by both models. Measures to promote consistency therefore do not favor one model over the other. They serve to enhance the reliability of the data and thereby the accuracy of parameter estimates and the test power for discriminating between models.

Both procedural choices may, however, have undesired effects that would then limit the generalizability of the results. For example, there may be effects of the order in which problems with rule versus problems without rule are presented on the ratings. Similarly, having an opportunity to reevaluate problem pairs with highly inconsistent ratings may have an effect on ratings. In Experiment 2, we presented two phases, one without rule and one with rule, but without special emphasis on the rule. We manipulated (a) presentation order of the two phases without and with rule (factor presentation order) and (b) whether participants had the opportunity to reevaluate problem pairs with highly inconsistent ratings (factor inconsistency correction).

In defense of our procedural choices, we hoped that both factors (presentation order and inconsistency correction) would have little impact on ratings and model parameters, and we expected the pattern of results to be the same as in Experiment 1. Note, however, that the dual-source model and Oaksford et al.’s (2000) model use the same equations to predict ratings in problems without rule. There was thus only one phase in Experiment 2, the phase with rule, rather than two phases as in Experiment 1 for which the dual-source model and Oaksford et al.’s model make different predictions. For this reason, the chances of discriminating between the dual-source model and Oaksford et al.’s model were lowered in this experiment relative to Experiment 1.

The major predictions were the following:

Prediction 1: Presentation order and inconsistency correction should have little effect on the ratings.

Prediction 2: The effects of content should again be mapped on the knowledge parameters ξ, and presentation order and inconsistency correction should have little effect on these parameters.

Prediction 3: The τ parameters should again be ordered approximately according to MP > MT > AC > DA, and presentation order and inconsistency correction should have little effect on the τ parameters and on the λ parameter.

Method

Participants. The 28 participants (10 male, 18 female, with age ranging from 18 to 34 years) were 26 University of Freiburg students with different majors, excluding majors that imply formal training in logic such as mathematics, one high school student, and one apprentice.

Each participant was requested to go through two phases with different phases separated by at least 1 week. Participants received monetary compensation of €20 for participating.

Procedure. Procedures were the same as in Experiment 1, but there were only two phases, one without rule and one with rule. Presentation order of the two phases was manipulated, as well as whether participants had an opportunity to correct highly inconsistent ratings in a third block in each phase (inconsistency correction). In particular, the third blocks were simply omitted for participants in the groups without inconsistency correction. The two factors were crossed, resulting in a balanced design with four groups of participants.

For problems with rule, the standard instructions were those that were employed for Phase 2 in the previous experiment; that is, the rule was not specially emphasized. Participants were told that they would go through two phases, one with rule and one without rule, but that the order in which the two phases were presented was randomly determined. Participants were also told that each problem would be presented twice, once with affirmative conclusion and once with the conclusion negated, in order to permit a reliable assessment of the asked-for probability.

Results and Discussion

Consistency. Consistency information for the uncorrected ratings is shown in the upper right panel of Figure 2. As in Experiment 1, participants’ ratings appear to be fairly well calibrated. Correlations between the two ratings, with that for the converse inverted, again tend to be reasonably high (M = .86, SD = .12).

Correlations were Fisher z transformed and submitted to an analysis of variance with factors presentation order and inconsistency correction. This revealed no significant effects or interactions, largest F(1, 24) = 1.43, smallest p = .25.
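For readers unfamiliar with the transformation, the Fisher z transform is the inverse hyperbolic tangent of the correlation; it stretches values near ±1 so that correlations can be analyzed on an approximately normal scale. A minimal Python sketch, with hypothetical function names and the mean correlation of .86 used purely as an example input:

```python
import math


def fisher_z(r):
    """Fisher z transform of a correlation coefficient (-1 < r < 1)."""
    return math.atanh(r)


def inverse_fisher_z(z):
    """Map a Fisher z value back to the correlation scale."""
    return math.tanh(z)


z = fisher_z(0.86)
print(round(z, 3))  # 1.293
```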

Of the 28 × 32 = 896 problem pairs presented, 34 or 4% had sums of ratings outside the interval (40,160). Half of the participants had an opportunity to reevaluate such problem pairs, and ratings from this third block replaced ratings from the first two blocks for the model analyses; thereafter, 23 or 3% of the problem pairs remained with sums of ratings outside the interval (40,160). As in Experiment 1, these were treated as missing data for the model analyses.

Rating data and Prediction 1. Ratings were aggregated across problem pairs of inference and converse inference as in Experiment 1, resulting in 32 data points per participant. We did this twice, once with the uncorrected ratings from the first two blocks and once with the ratings from the corrective third block, where presented, replacing the initial ratings. Both resulting data sets were submitted to analyses of variance with between-participants factors presentation order and inconsistency correction and within-participants factors content (four contents), inference (MP, MT, AC, and DA), and phase (without rule vs. with rule). Despite the many significance tests for effects and interactions involving presentation order and inconsistency correction, none of these reached significance in either analysis (largest F = 2.42, smallest p = .08).

The mean ratings (with corrections where applicable) are shown in Figure 4 as a function of content and inference for problems without rule (left panel) and problems with rule (middle panel). As can be seen, the data show a pattern that is similar to that observed by Liu (2003) and in Experiment 1 as a function of content and rule presence.

[Figure 4 graphic omitted: three panels (Without Rule, With Rule, Model Parameters); horizontal axis: inference (MP, MT, AC, DA); vertical axis: Probability Rating, or Parameter Estimates (in Percent) for the Model Parameters panel, 30 to 100; separate lines for contents HH, HL, LH, and LL, with τ values marked in the Model Parameters panel.]

Figure 4. Mean ratings and parameter estimates for Experiment 2. The left and middle panels show mean ratings for problems without rule and with rule, respectively, as a function of content (HH, HL, LH, and LL) and inference (MP [modus ponens], MT [modus tollens], AC [affirmation of the consequent], and DA [denial of the antecedent]). The right panel shows mean parameter estimates (multiplied by 100) of the knowledge parameters and of the τ parameters (with values marked as τ) for the dual-source model. HH, HL, LH, and LL refer to contents presented to be simultaneously either high (H) or low (L) in perceived sufficiency (first letter in the pair) and perceived necessity (second letter in the pair) of p for q.

Prediction 2. The dual-source model was fit to the data from each individual separately. It comprises 17 parameters per person: 4 τ parameters (for inferences MP, MT, AC, and DA), 12 parameters underlying the knowledge parameters ξ (3 parameters, a[C], b[C], and e[C], per content), as well as 1 λ parameter for the phase with rule. Model fit was good (mean R² = .95, SD = .057). Mean knowledge parameters are shown in Figure 4 (see right panel). In an analysis of variance with factors presentation order, inconsistency correction, content, and inference, there were no significant effects or interactions involving presentation order and/or inconsistency correction on the knowledge parameters (largest F = 2.92, smallest p = .07).

As can be seen in Figure 4, knowledge parameters (in percentages) for MP and MT problems were higher for contents high in perceived sufficiency (M = 93.87) than for contents low in perceived sufficiency (M = 59.78), t(27) = 14.73, p < .01. Knowledge parameters for AC and DA problems were higher for contents high in perceived necessity (M = 94.59) than for contents low in perceived necessity (M = 63.59), t(27) = 11.83, p < .01. In sum, Prediction 2 could be upheld.

Prediction 3. The mean τ parameters are also shown in Figure 4. In an analysis of variance with factors presentation order, inconsistency correction, and inference, there were no significant effects or interactions involving presentation order and/or inconsistency correction (largest F = 2.03, smallest p = .17). As in Experiment 1, the main effect of inference was, however, significant, F(2.39, 57.45) = 7.04, p < .01. Planned contrasts revealed that τ for the MP inference significantly exceeded τ for MT, F(1, 24) = 3.61, p = .07 (p values reported in this article are two-tailed, but because the difference is in the predicted direction, this p value can be halved for a one-tailed significance test, resulting in p = .035), and τ for the MT inference exceeded the mean τ for the two invalid inferences AC and DA, F(1, 24) = 5.11, p = .03. As in Experiment 1, τ(AC) and τ(DA) did not differ from each other significantly (F < 1).

The mean λ parameter for the relative impact of form-based evidence was .69 (SD = .36). The λ parameters were submitted to an analysis of variance with factors presentation order and inconsistency correction. This revealed no significant effect or interaction, largest F(1, 24) = 1.42, smallest p = .25. Taken together, Prediction 3 could be upheld.

Belief ratings. Mean rule believability for contents HH, HL, LH, and LL was, in order, 92 (SD = 7.2), 93 (SD = 11), 29 (SD = 21), and 59 (SD = 26). The differences between contents were significant, F(2.35, 63.40) = 97.11, p < .01. As in Experiment 1 and in line with previous work, the pattern of belief ratings again closely follows that for MP in the knowledge parameters and the pattern of Phase 1 ratings for MP without rule (see Figure 4).

Fit of Oaksford et al.’s (2000) model. We again fit Oaksford et al.’s model. It requires 16 parameters: 4 a parameters; 4 b parameters; and 4 e parameters, 1 for each content; as well as 4 additional e parameters for the phase with rule. In addition, we again fit the modified model with a separate parameter b for the content LH in the presence of a rule (see Experiment 1). The modified model requires 17 parameters. The unmodified model fit significantly worse than the dual-source model (R² = .81), t(27) = –5.42, p < .01. The modified model uses an additional parameter and reaches a mean R² of .93 that approaches that of the dual-source model (R² = .95), t(27) = –1.53, p = .14.

Summary. The purpose of Experiment 2 was to defend certain procedural choices implemented in Experiment 1 and in the experiments that follow: (a) the fixed presentation order with baseline ratings for problems without rule first, followed by ratings with rule, and (b) having an opportunity to reevaluate problem pairs with highly inconsistent ratings. None of the dependent variables, neither ratings nor model parameters, showed effects of the factors presentation order and inconsistency correction despite the many significance tests conducted, and the results replicated those found for the first two phases of Experiment 1. It seems likely that as the number of participants is increased, small effects of presentation order or inconsistency correction would eventually emerge, but for the sample sizes employed in the present series of experiments, the effects appear to be negligible. On the basis of Experiment 2, we felt justified in maintaining the tested procedural choices (a) and (b) in the subsequent experiments.

As already mentioned, the preferred presentation order (problems without rule first, followed by problems with rule) is a natural order, and the reverse order, in which rules are present first and then withdrawn in a second phase, seemed awkward. Furthermore, having an opportunity to reevaluate problem pairs with highly inconsistent ratings does indeed increase consistency for the corrected ratings. When we computed the correlations between original and converse inferences (with the rating for the converse inferences inverted) on the basis of the corrected ratings for the participants with inconsistency correction and compared them with the correlation for the uncorrected ratings, the correlations were trivially unchanged for seven of 14 participants who did not produce highly inconsistent ratings in the first place but increased for the other seven participants. Importantly, the present data suggest that the increase in consistency and the associated increase in reliability and statistical test power do not come at the cost of a systematic bias in ratings or model parameters.

As expected, the data of Experiment 2 did not permit us to discriminate between the dual-source model and Oaksford et al.’s (2000) model as clearly as those of Experiment 1.

Experiment 3

In Experiment 3, we focus on a manipulation of rule form. Specifically, we compared problems using “if p then q” rules with problems using “p only if q” rules. The manipulation of rule form should map on the τ parameters given that it affects the logical form of the arguments, but not the background knowledge about the particular contents used in the rule statements.

A few studies have compared conditional inferences for if–then and only–if, many of them using abstract materials and instructions stressing logical necessity (as summarized by Evans & Over, 2004, Chapter 3). One finding is that only–if tends to lead to higher rates of the backward inferences, AC and MT, whereas if–then tends to lead to higher rates of the forward inferences, MP and DA (with the labels MP, MT, AC, and DA defined by the minor premise and conclusion in problems with “if p then q”). To the extent to which these directional biases also affect the forward and backward converse inferences (MP′, MT′, AC′, and DA′), they should cancel out in the present analyses, because we aggregate across each pair of an inference and its converse with ratings for converse inferences mirrored at 50%. In addition, the literature suggests that only–if more strongly stresses the necessity of q for p (Thompson & Mann, 1995), facilitating in particular the MT inference (and rejection of the MT′ inference). For example, Braine (1978) argued that “only X” means “no Y other than X” and in his view, the rule “p only if q” is thereby frequently paraphrased as “If other than q, not p” or “If not q, then not p.” In a similar vein, Johnson-Laird et al. (1992) argued that only–if makes cases with not-p and not-q salient, facilitating MT. We therefore expected effects of rule form on τ(MT) with higher values for only–if than for if–then.

In Experiment 3, we used the same contents as in Experiments 1 and 2, with p and q interchanged for the HL and LH contents, so that these contents now became LH and HL contents, respectively. For exploratory reasons, we implemented a fifth content with a relatively unfamiliar topic. The corresponding if–then rule was “If the level of oxytocin in the blood is increased, lactation is elevated.” We expected this content to elicit knowledge parameters of intermediate levels, and we refer to it by the letter U in the figures below.

Although the Oaksford et al. (2000) model has not been formally extended to deal with only–if, we tried to adapt it to the rule “p only if q” for the sake of comparison. According to the above, the rule “p only if q” seems to stress that cases without q, but with p, are the exception. Consequently, we chose the exceptions parameter to model the probability P(p | not-q) for only–if (Oaksford & Chater, 2007, p. 155) rather than P(not-q | p) as for if–then, with parameters a and b unchanged. This allows the model to accommodate enhanced acceptance of MT for only–if relative to if–then.
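To make the consequence of this reparameterization concrete, the four inference probabilities can be worked out from the joint distribution of p and q. The derivation below is a sketch under the assumption that a = P(p) and b = P(q), as in Oaksford et al.’s (2000) model; the symbol e′ for the adapted exceptions parameter P(p | not-q) is introduced here only for illustration.

```latex
% Assumptions: a = P(p), b = P(q); e' = P(p \mid \neg q) is the adapted
% exceptions parameter for "p only if q" (notation introduced here).
\begin{align*}
P(p \wedge \neg q) &= e'(1 - b), &
P(p \wedge q) &= a - e'(1 - b), \\
P(\mathrm{MP}) = P(q \mid p) &= 1 - \frac{e'(1 - b)}{a}, &
P(\mathrm{MT}) = P(\neg p \mid \neg q) &= 1 - e', \\
P(\mathrm{AC}) = P(p \mid q) &= \frac{a - e'(1 - b)}{b}, &
P(\mathrm{DA}) = P(\neg q \mid \neg p) &= \frac{(1 - b)(1 - e')}{1 - a}.
\end{align*}
```

Under this parameterization MT depends on e′ alone, so a small exceptions parameter raises MT directly, in the same way that a small e = P(not-q | p) raises MP directly under if–then.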

The major predictions were as follows:

Prediction 1: The dual-source model will exhibit an adequate model fit, and it will provide a better description of the data than the (adapted) model by Oaksford et al. (2000).

Prediction 2: The effects of content should again be mapped on the knowledge parameters ξ. We expected intermediate parameter values for the relatively unfamiliar content U.

Prediction 3: The τ parameters should again be ordered approximately according to MP > MT > AC > DA. Furthermore, rule form should have an effect on the τ parameters. In particular, we expected τ(MT) to be larger for only–if than for if–then.

There was little reason to expect an effect of rule form on the weight parameter, given our initial working assumption that the weight of form-based evidence is largely determined by the experimental setting, and the instructions in particular. We allowed for different weight parameters for only–if and if–then, however, to assess a potential effect of rule form on the use of form-based evidence.

Method

Participants. Participants were 18 University of Freiburg students (four male, 14 female, with age ranging from 19 to 28 years) with different majors, excluding majors that imply formal training in logic such as mathematics. Each participant was requested to go through three phases of the experiment with different phases separated by at least 1 week. Participants received monetary compensation of €14.

Procedure. Procedures were as in Experiment 1 with the following changes: There were now five contents and hence 40 problems generated by crossing contents and (original and converse) inferences. In Phase 1, these problems were presented without rule. In Phases 2 and 3, the problems were presented with rule, using the standard instructions that did not specially emphasize the rule. In Phases 2 and 3, two of the five contents used one rule form (either if–then or only–if); the remaining three used the other rule form. Across Phases 2 and 3, if–then and only–if were crossed with all five contents. Which contents and how many (two vs. three) were paired with if–then rather than only–if for Phase 2 were randomized for each participant anew. Contents and rule form were thereby counterbalanced across the two phases with rule. In each phase, 40 problems were presented in two blocks of 20 problems.
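The counterbalancing scheme can be illustrated with a short Python sketch. It is a hypothetical reconstruction of the assignment logic described above, not the original experiment software; the function name, the content labels used as dictionary keys, and the seed are assumptions.

```python
import random

CONTENTS = ["HH", "HL", "LH", "LL", "U"]


def assign_rule_forms(rng):
    """Randomly pair contents with rule forms for Phases 2 and 3.

    Phase 2: two or three randomly chosen contents get if-then, the rest
    get only-if. Phase 3 swaps the forms, so every content appears once
    with if-then and once with only-if across the two phases.
    """
    n_if_then = rng.choice([2, 3])
    if_then_contents = set(rng.sample(CONTENTS, n_if_then))
    phase2 = {c: "if-then" if c in if_then_contents else "only-if" for c in CONTENTS}
    phase3 = {c: "only-if" if form == "if-then" else "if-then" for c, form in phase2.items()}
    return phase2, phase3


phase2, phase3 = assign_rule_forms(random.Random(1))
print(phase2)
print(phase3)
```

Swapping the forms in Phase 3 is what guarantees that rule form and content are fully crossed within each participant.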

Following the two blocks, the program again presented problem pairs of an inference and its converse with sum of ratings less than 40 or larger than 160 in a third block. At the end of Phase 3, participants rated their belief in the validity of all 10 rules that they had seen in the course of the experiment.

Results and Discussion

Consistency. Consistency information for the uncorrected ratings is shown in the lower left panel of Figure 2. Participants’ ratings again appear to be fairly well calibrated. Correlations between the two ratings, with that for the converse inverted, tend to be reasonably high (M = .85, SD = .16).

Summed over Phases 1 to 3, an average of 2.4 of the 60 problem pairs (SD = 2.66, range 0 to 10) were repeated in one of the third blocks. For the analyses below, ratings from the third block replaced ratings from the first two blocks, increasing consistency even further. Thereafter, there remained four problem pairs in all the individuals’ data with sum of ratings outside the range from 40 to 160. These individual ratings were treated as missing data for the model analyses.

Rating data. Ratings were aggregated across problem pairs of inference and converse inference as in Experiment 1, resulting in 60 data points per participant. The first three panels of Figure 5 show the mean ratings as a function of inference and content. As can be seen, the contents differed as expected, and the unfamiliar content elicited ratings of an intermediate level. The effects of adding a rule were to compress ratings toward higher levels with stronger effects on MP and MT than on AC and DA. The effects of only–if on compressing MT ratings appear to be a little stronger than the effects of if–then.

Prediction 1. For the model analyses, the 60 data points in the first three panels of Figure 5 were fit by the dual-source model with 25 parameters for each individual separately. These comprise 4 τ parameters for if–then and 4 for only–if; 15 parameters underlying the knowledge parameters ξ (3 parameters, a[C], b[C], and e[C], per content); as well as 2 λ parameters, 1 for problems with if–then rule and 1 for problems with only–if rule. Parameter λ was as always set to zero for problems without rule. Model fit was acceptable (mean R² = .89, SD = .081).


The model by Oaksford et al. (2000) also required 25 parameters: 5 a parameters; 5 b parameters, one per content; as well as 5 e parameters for the problems without rule; 5 new e parameters for the problems with if–then; and 5 new e parameters for the problems with only–if. The mean R² was .78 (SD = .12), and the difference between the two models was significant, t(17) = –3.16, p < .01. We again modified the Oaksford et al. model by permitting a new b parameter for the LH rule; this model with 26 parameters achieved a mean R² of .83 (SD = .08), which was again significantly smaller than that of the dual-source model in a one-tailed test, t(17) = –1.74, p = .049. The dual-source model received higher R² values than the modified Oaksford et al. model for 11 of the 18 participants. Prediction 1 can be upheld.

Prediction 2. The knowledge parameters are also shown in Figure 5 (see bottom right panel). As can be seen, knowledge parameters for MP and MT problems were higher for contents high in perceived sufficiency (M = 94.33) than for contents low in perceived sufficiency (M = 61.71), t(17) = 9.54, p < .01. Knowledge parameters for AC and DA problems were higher for contents high in perceived necessity (M = 94.03) than for contents low in perceived necessity (M = 55.13), t(17) = 14.75, p < .01. The relatively unfamiliar content elicited knowledge parameters of intermediate level. Prediction 2 could thus be upheld.

Prediction 3. The mean τ parameters are also shown in Figure 5 (see bottom right panel, lines with points labeled IT [for if–then] and OI [for only–if]). As can be seen, they follow the expected order with MP > MT > AC > DA. An analysis of variance of the τ parameters with rule form (if–then vs. only–if) and inference as factors revealed a significant interaction, F(2.64, 44.80) = 4.46, p < .01. Individual t tests performed per inference showed as expected that τ(MT) was significantly larger for only–if than for if–then, t(17) = 3.07, p < .01, all other |t| < 1.

Planned contrasts for the if–then rule furthermore revealed that τ for the MP inference significantly exceeded the mean τ for MT, F(1, 17) = 12.22, p < .01, whereas τ for the MT inference nonsignificantly exceeded the mean τ for the two invalid inferences AC and DA (F < 1). Again, there were no significant differences between AC and DA (F < 1).

[Figure 5. Panels: Without Rule; With If–Then Rule; With Only–If Rule; Model Parameters. Horizontal axis: inference (MP, MT, AC, DA). Vertical axis: Probability Rating (first three panels) and Parameter Estimates (in Percent) (Model Parameters panel). Lines: contents HH, HL, LH, LL, and U; IT and OI mark the τ parameters.]

Figure 5. Mean ratings and parameter estimates for Experiment 3. The top left, top right, and bottom left panels show, in order, mean ratings for problems without rule, problems with if–then rule, and problems with only–if rule as a function of content (HH, HL, LH, LL, and U) and inference (MP [modus ponens], MT [modus tollens], AC [affirmation of the consequent], and DA [denial of the antecedent]). The bottom right panel shows mean parameter estimates (multiplied by 100) of the knowledge parameters and of the τ parameters (with values for if–then [IT] and only–if [OI]) for the dual-source model. HH, HL, LH, and LL refer to contents presented to be simultaneously either high (H) or low (L) in perceived sufficiency (first letter in the pair) and perceived necessity (second letter in the pair) of p for q. U refers to content expected to elicit knowledge parameters of intermediate levels.

Inspecting the τ parameters for the different participants individually, some of the participants again had equally high τ parameters for all inferences, suggesting that they adopted a biconditional interpretation of the conditional statements. Prediction 3 can be upheld.

Parameter λ. Separate weights λ were estimated for the impact of form-based evidence for both rule forms. Mean λ parameters for if–then and only–if were .88 (SD = .19) and .88 (SD = .25), respectively. As expected, the difference between the two was not significant (|t| < 1). Compared to Experiment 1’s Phase 2 and Experiment 2, form-based evidence received somewhat higher weights in Experiment 3.

Belief ratings. Mean rule believability for contents HH, HL, LH, LL, and U was, in order, 88 (SD = 17), 94 (SD = 17), 41 (SD = 34), 19 (SD = 18), and 78 (SD = 18) for the if–then rules and 79 (SD = 18), 87 (SD = 30), 30 (SD = 31), 12 (SD = 16), and 64 (SD = 25) for the only–if rules. An analysis of variance with factors rule form and content revealed main effects of content, F(2.93, 41.36) = 46.89, p < .01, and rule form, F(1, 17) = 12.10, p < .01. The interaction was not significant (F < 1), so that only–if rules were simply rated somewhat less believable irrespective of content. As in Experiments 1 and 2 and in line with previous work, the pattern of belief ratings again follows that for MP in the knowledge parameters and the pattern of baseline ratings for MP (see Figure 5).

Summary. The dual-source model again provided a satisfactory account of the data. The effects of content mapped onto the knowledge parameters as in Experiment 1. Rule form had a very focused effect: Use of only–if selectively increased the τ parameter for MT as expected. Rule form had no impact on λ, the relative weight for the form-based evidence. The dual-source model thereby provides a very parsimonious account of the effects of rule form.

Experiment 4

A classical paradigm in conditional reasoning is the so-called negations paradigm (Evans & Lynch, 1973). It involves administering equivalent inferences on four conditional statements (AA, AN, NA, and NN) that differ in whether antecedent and/or consequent are affirmed (A) or negated (N):

AA: If p then q.

AN: If p then not-q.

NA: If not-p then q.

NN: If not-p then not-q.

One major finding emerging from this paradigm is that negative conclusions tend to be accepted more frequently than affirmative conclusions, an effect that is referred to as negative conclusion bias or polarity bias (Evans & Over, 2004). In studies using abstract content, the effect is usually more pronounced for MT and DA than for MP and AC.
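Whether a conclusion is affirmative or negated follows mechanically from the rule kind and the inference. The following Python sketch illustrates that bookkeeping. The function name and the two-letter string encoding of rule kinds are illustrative assumptions, and double negations are assumed to be resolved (so that MT on an NA rule concludes p rather than not-not-p).

```python
def conclusion_is_negated(rule_kind: str, inference: str) -> bool:
    """Return True if the conclusion of `inference` on a rule of `rule_kind` is negated.

    rule_kind: "AA", "AN", "NA", or "NN" (antecedent polarity, consequent polarity).
    inference: "MP", "MT", "AC", or "DA".
    """
    antecedent_negated = rule_kind[0] == "N"
    consequent_negated = rule_kind[1] == "N"
    if inference == "MP":   # concludes the consequent as stated in the rule
        return consequent_negated
    if inference == "AC":   # concludes the antecedent as stated in the rule
        return antecedent_negated
    if inference == "MT":   # concludes the negation of the antecedent
        return not antecedent_negated
    if inference == "DA":   # concludes the negation of the consequent
        return not consequent_negated
    raise ValueError(f"unknown inference: {inference}")


# Tabulate conclusion polarity for all 16 rule-by-inference combinations.
for kind in ("AA", "AN", "NA", "NN"):
    row = {inf: ("negated" if conclusion_is_negated(kind, inf) else "affirmative")
           for inf in ("MP", "MT", "AC", "DA")}
    print(kind, row)
```

Each inference has a negated conclusion for exactly two of the four rule kinds, which is what allows ratings to be collapsed by conclusion polarity separately for each inference.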

Oaksford et al. (2000) originally introduced their model to account for polarity bias in the negations paradigm. They argue that negations define categories with higher probability than their affirmative counterparts. In this view, a statement such as “The student did not learn” is interpreted as a contrast set relative to a superordinate category of possible activities suggested by the context in which the statement was uttered and thus as “The student engaged in possible activities other than learning.” The contrast set is regularly much larger than the set implied by the affirmative statement and hence likely to be seen as more probable. Other things being equal, this leads to the prediction that MT, AC, and DA inferences with negated conclusion should be accepted more frequently than these same inferences with affirmative conclusion under Oaksford et al.’s model. In other words, the negative conclusion bias is reinterpreted as a high-probability conclusion effect.

The negations paradigm provides a challenge for both Oaksford et al.’s (2000) and our model when the same content C is used in the four rules. It should then be possible to use the same knowledge parameters ξ and the same parameters a = P(p) and b = P(q) in Oaksford et al.’s model to describe the ratings obtained for the four different rules (for reasons analogous to those explained in Footnote 3), taking into account in the model equations that a and b for rules with affirmative antecedent and consequent, respectively, become 1 − a and 1 − b for rules with negated antecedent and consequent. This situation is challenging for both models because it means that it should be possible to model the data for all four rules using only a few parameters. The ratio of parameters to data points was, in order, 18:48, 17:32, and 25:60 for Experiments 1, 2, and 3 for the dual-source model, and 21:48, 17:32, and 26:60 for the model by Oaksford et al. In the present experiment, these ratios were substantially smaller, 17:80 for the dual-source model and 28:80 for Oaksford et al.’s model as explained below.
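The substitution of 1 − a and 1 − b for negated antecedents and consequents can be made concrete with a short Python sketch. It assumes the conditional-probability equations commonly attributed to Oaksford et al. (2000), which are not restated in this section: the joint distribution of antecedent and consequent is parameterized by a, b, and an exceptions parameter e, and each inference is scored by the conditional probability of its conclusion given its minor premise. The function names are illustrative.

```python
def rule_marginals(a: float, b: float, rule_kind: str) -> tuple:
    """Marginal probabilities of the rule's antecedent and consequent as stated.

    a = P(p) and b = P(q) are defined for the affirmative clauses; a negated
    clause ("N") takes the complementary probability.
    """
    a_rule = 1.0 - a if rule_kind[0] == "N" else a
    b_rule = 1.0 - b if rule_kind[1] == "N" else b
    return a_rule, b_rule


def conditional_probabilities(a: float, b: float, e: float) -> dict:
    """Inference probabilities from antecedent probability a, consequent
    probability b, and exceptions parameter e = P(not consequent | antecedent).

    Assumed form of the model equations (an assumption of this sketch).
    """
    if not (0.0 < a < 1.0 and 0.0 < b < 1.0 and 0.0 <= e <= 1.0):
        raise ValueError("a and b must lie in (0, 1) and e in [0, 1]")
    both = a * (1.0 - e)          # antecedent and consequent
    exception = a * e             # antecedent without consequent
    only_consequent = b - both    # consequent without antecedent
    neither = 1.0 - b - exception
    if min(only_consequent, neither) < 0.0:
        raise ValueError("a, b, and e do not define a probability distribution")
    return {
        "MP": 1.0 - e,
        "MT": neither / (1.0 - b),
        "AC": both / b,
        "DA": neither / (1.0 - a),
    }


# Same content, NA rule ("if not-p then q"): only the antecedent marginal flips.
a_rule, b_rule = rule_marginals(a=0.5, b=0.6, rule_kind="NA")
print(conditional_probabilities(a_rule, b_rule, e=0.1))
```

In this sketch a and b are shared across the four rule kinds for a given content, whereas e has to be supplied anew for each rule, which is the source of the larger parameter count discussed below.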

We chose four rule contents such that each of the rules AA, AN, NA, and NN was especially plausible for (at least) one content. The contents and plausible rules were as follows:

Balloon: If a balloon is pricked with a needle, then it will pop (AA rule).

Car: If the battery is empty, then the car will not start (AN rule).

Exam: If a student has not learned, then he will fail the exam (NA rule).

Pregnancy: If a girl has not had sexual intercourse, then she will not be pregnant (NN rule).

Note that each content was used in all four kinds of rules. For example, for the balloon context, we also presented the AN rule, “If a balloon is pricked with a needle, then it will not pop”; the NA rule, “If a balloon is not pricked with a needle, then it will pop”; and the NN rule, “If a balloon is not pricked with a needle, then it will not pop.” We also designated four rules as implausible; these were the above rules with the polarity of the consequent reversed. The purpose of this partial classification of rules as plausible versus implausible was to ensure that participants would never work through a phase of exclusively plausible or exclusively implausible problems (see procedures below).

The predictions were as follows:

Prediction 1: The dual-source model will exhibit an adequate model fit, and it will provide a better description of the data than the model by Oaksford et al. (2000).

Prediction 2: The ratings predicted by the dual-source model will adequately reproduce any polarity bias present in the observed ratings.

Prediction 3: The τ parameters should again be ordered approximately as MP > MT > AC > DA.

Method

Participants. Participants were 13 University of Freiburg students (four male, nine female, with age ranging from 20 to 42 years) with different majors, excluding majors that imply formal training in logic such as mathematics. Each participant was requested to go through five phases of the experiment with different phases separated by at least 1 week. Participants received monetary compensation of €21.

Procedure. Procedures were as in Experiment 1 with the following changes: Phase 1 was again the baseline phase in which 32 problems were presented without rules. These 32 problems were generated by crossing contents and (original and converse) inferences defined relative to the AA rules with rule omitted. In Phases 2 to 5, the problems were presented with rules. Rule kind (AA, AN, NA, and NN) was crossed with content, generating 16 rules. Each phase presented problems for four of the 16 rules selected randomly with the following restrictions: In each phase, all four contents and all four rule kinds occurred, as did one of the four rules designated plausible and one of the four rules designated implausible. Contents and rule kind were thereby counterbalanced across the four phases with rule.
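One way to see that these restrictions can be met jointly is to enumerate the admissible assignments. The Python sketch below is an illustration only and is not the randomization routine actually used in the experiment: it lists every way of assigning one rule kind to each content within a phase, keeps the assignments containing exactly one plausible and exactly one implausible rule, and then searches for sets of four such phases in which every content meets every rule kind once. All names are hypothetical.

```python
import itertools
import random

CONTENTS = ["balloon", "car", "exam", "pregnancy"]
KINDS = ["AA", "AN", "NA", "NN"]
# Plausible rule kind per content, as listed in the text.
PLAUSIBLE = {"balloon": "AA", "car": "AN", "exam": "NA", "pregnancy": "NN"}


def flip_consequent(kind: str) -> str:
    """Reverse the polarity of the consequent (second letter)."""
    return kind[0] + ("N" if kind[1] == "A" else "A")


# Implausible rules are the plausible rules with the consequent polarity reversed.
IMPLAUSIBLE = {c: flip_consequent(k) for c, k in PLAUSIBLE.items()}


def valid_phase(assignment) -> bool:
    """assignment: one rule kind per content, in CONTENTS order."""
    pairs = list(zip(CONTENTS, assignment))
    n_plausible = sum(PLAUSIBLE[c] == k for c, k in pairs)
    n_implausible = sum(IMPLAUSIBLE[c] == k for c, k in pairs)
    return n_plausible == 1 and n_implausible == 1


def all_schedules():
    """All sets of four phases that jointly cover each of the 16 rules once."""
    # A permutation of KINDS uses every rule kind and every content once.
    phases = [p for p in itertools.permutations(KINDS) if valid_phase(p)]
    schedules = []
    for combo in itertools.combinations(phases, 4):
        # Each content must be paired with four different rule kinds.
        if all(len({phase[i] for phase in combo}) == 4 for i in range(4)):
            schedules.append(combo)
    return schedules


def draw_schedule(rng: random.Random):
    """Pick one admissible schedule at random and shuffle the phase order."""
    schedule = list(rng.choice(all_schedules()))
    rng.shuffle(schedule)
    return [dict(zip(CONTENTS, phase)) for phase in schedule]


for number, phase in enumerate(draw_schedule(random.Random(1)), start=2):
    print(f"Phase {number}:", phase)
```

Exhaustive enumeration is affordable here because only 24 assignments per phase exist; a larger design would call for a constructive or sampling-based approach instead.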

Results and Discussion

Consistency. Consistency information is shown in the lower right panel of Figure 2. Participants’ ratings again appear to be fairly well calibrated. Correlations between the two ratings, with that of the converse inverted, tend to be reasonably high (M = .87, SD = .09).

Summed over Phases 1 to 5, an average of 1.7 of the 80 problem pairs (SD = 2.39, range 0 to 9) were repeated in one of the third blocks. For the analyses below, ratings from the third block replaced ratings from the first two blocks, increasing consistency even further. Thereafter, there remained 22 problem pairs (2% of the data) in all the individuals’ data with a sum of ratings outside the interval (40, 160). These individual ratings were treated as missing data in the model analyses.
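The exclusion criterion can be illustrated with a minimal Python sketch. It assumes ratings on a 0 to 100 scale and assumes that a problem pair is aggregated by averaging the rating of the original inference with the inverted rating of its converse; the aggregation rule and the function name are assumptions of this sketch, as the exact procedure is described with Experiment 1 rather than here.

```python
from typing import Optional


def aggregate_pair(rating: float, converse_rating: float,
                   low: float = 40.0, high: float = 160.0) -> Optional[float]:
    """Combine an inference rating with the rating of its converse.

    Perfectly consistent ratings sum to 100. Pairs whose sum falls outside
    the open interval (low, high) are treated as missing (None).
    """
    total = rating + converse_rating
    if not (low < total < high):
        return None
    # Assumed aggregation: mean of the rating and the inverted converse rating.
    return (rating + (100.0 - converse_rating)) / 2.0


print(aggregate_pair(80, 30))   # consistent enough: aggregated data point
print(aggregate_pair(90, 90))   # sum of 180 lies outside (40, 160): missing
```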

Rating data. Ratings were aggregated across problem pairs of inference and converse inference as in Experiment 1, resulting in 80 data points per participant. Figure 6 shows the mean ratings as a function of inference and content. The upper left panel shows the ratings in the baseline phase without rule; the panels other than the bottom right panel show the ratings for the different rule kinds. The inferences are defined relative to the AA rules in the baseline phase and relative to the relevant rule for the other panels. For example, in the panel showing the results with NA rules, if not-p then q, MP presents not-p and q as minor premise and conclusion, respectively. This makes it somewhat more difficult to compare the panels with one another than in the previous experiments.

Plausibility check. For each content, one of the four rules was a priori designated as plausible and one as implausible (there was no a priori classification of the remaining two rules as either plausible or implausible). For the four contents (exam, car, balloon, and pregnancy), mean rule believability for the plausible rules was, in order, 71 (SD = 23), 98 (SD = 5), 91 (SD = 27), and 93 (SD = 13); for the implausible rules, it was 11 (SD = 21), 5 (SD = 18), 2 (SD = 6), and 1 (SD = 2). The difference between plausible and implausible rule was significant for each content, smallest t(12) = 8.58, all ps < .01.

Rule believability was again highly predictive of MP ratings: The correlation between mean believability and mean MP rating was .85 across the 16 rules.

Prediction 1. For the model analyses, the 80 data points in the five panels of Figure 6 (other than the lower right panel) were fit by the dual-source model with 17 parameters for each individual separately. These comprise 4 τ parameters, 12 parameters underlying the knowledge parameters ξ (3 parameters, a[C], b[C], and e[C], per content), as well as 1 parameter λ for the relative weight of the form-based evidence in problems with rule. Model fit was acceptable (mean R² = .79, SD = .08), taking into account the comparatively low ratio of parameters to data points.

The model by Oaksford et al. (2000) requires 28 parameters: 4 a and 4 b parameters as well as 4 e parameters for problems without rule and 16 new e parameters, 1 for each of the 16 rules. A separate exceptions parameter is necessary for each content and rule kind because the cases that are exceptions differ as a function of rule kind and their perceived likelihood as a function of content according to that model. The mean R² was .78 (SD = .13), and the difference between the two models was not significant (t < 1). Thus, the dual-source model achieves a level of fit that is equivalent to that of the Oaksford et al. model, although it uses only 17 rather than 28 parameters. In comparing models, level of fit and parsimony of the model (in terms of number of parameters) must both be weighed (e.g., Myung, 2000), and hence it is fair to conclude that the dual-source model provides the better description of the data on the basis of its much greater parsimony (see also Footnote 6).
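The parameter counts and the resulting ratios of parameters to data points for this experiment can be tallied as follows. This Python sketch only restates the arithmetic given in the text; the function and variable names are illustrative.

```python
N_CONTENTS = 4      # balloon, car, exam, pregnancy
N_INFERENCES = 4    # MP, MT, AC, DA
N_RULE_KINDS = 4    # AA, AN, NA, NN


def dual_source_parameter_count() -> int:
    tau = N_INFERENCES               # one form-based parameter per inference
    knowledge = 3 * N_CONTENTS       # a[C], b[C], e[C] per content
    weight = 1                       # relative weight of form-based evidence
    return tau + knowledge + weight


def oaksford_parameter_count() -> int:
    a = N_CONTENTS
    b = N_CONTENTS
    e_baseline = N_CONTENTS                 # problems without rule
    e_rules = N_CONTENTS * N_RULE_KINDS     # one per content and rule kind
    return a + b + e_baseline + e_rules


# Baseline phase plus four rule kinds, each with 4 contents x 4 inferences.
data_points = N_CONTENTS * N_INFERENCES * (1 + N_RULE_KINDS)

for name, count in (("dual-source", dual_source_parameter_count()),
                    ("Oaksford et al.", oaksford_parameter_count())):
    print(f"{name}: {count}:{data_points} (ratio {count / data_points:.2f})")
```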

Given that the Oaksford et al. (2000) model at this point already requires many more parameters than the dual-source model, we did not fit a modified version admitting additional b parameters for some of the rules as was done in the previous experiments: It is trivial that the goodness of fit of the model improves as more and more additional parameters are added.

The lower right panel of Figure 6 also shows the model parameters under the dual-source model. As can be seen, the knowledge parameters (gray lines, coded relative to the AA rules) again roughly reflect the pattern of ratings in the baseline phase. The τ parameters are discussed below. Parameter λ, the relative weight of the form-based evidence, was .81 (SD = .31).

Prediction 2. Figure 7 shows polarity effects in ratings and model predictions. Ratings and model predictions were collapsed across rules and contents separately for each inference and conclusion polarity (affirmative vs. negated), leaving out problems presented without rule. It can be seen that there is a negative conclusion bias: Inferences with negated conclusions are rated as more probable than inferences with affirmative conclusions as expected. An analysis of variance of the observed ratings with factors polarity and inference revealed a main effect of polarity, F(1, 12) = 22.32, p < .01, and a main effect of inference, F(1.44, 17.33) = 8.52, p < .01. The polarity effect was individually significant in one-tailed t tests for MP and DA inferences, as marked by asterisks in Figure 7 (left panel). In studies using abstract content, polarity biases are often strongest on MT and DA, but there are few studies of polarity bias with materials for which prior knowledge is available.

This same pattern of polarity bias was also reproduced by the dual-source model (middle panel of Figure 7) as well as by the Oaksford et al. (2000) model. That is, there were significant main effects of polarity for the predictions under both models, both Fs(1, 12) > 11.56, both ps < .01, and individually significant polarity effects for MP and DA. As can be seen in Figure 7, the main effect of inference was correctly reproduced by the dual-source model, and it was significant in the ratings predicted by that model, F(1.66, 19.99) = 6.10, p < .01. There was also an interaction of polarity and inference, F(1.59, 19.06) = 6.86, p < .01, for the ratings predicted by the dual-source model. However, the Oaksford et al. model does not even approximately reproduce the effect of inference as is evident from Figure 7, and the main effect of inference was not significant under that model, F(1.76, 21.13) = 1.89, p = .18. It is of course possible and even likely that the latter model will eventually fit the observed pattern as additional parameters are added to it.

[Figure 6. Panels: Without Rule; With AA Rule; With AN Rule; With NA Rule; With NN Rule; Model Parameters. Horizontal axis: inference (MP, MT, AC, DA). Vertical axis: Probability Rating (rating panels) and Parameter Estimates (in Percent) (Model Parameters panel), 0 to 100. Legend: Exam, Car, Pregnancy, Balloon; Plausible, Implausible; τ marks the τ parameters.]

Figure 6. Mean ratings and parameter estimates for Experiment 4. The top left, top middle, top right, bottom left, and bottom middle panels show, in order, mean ratings for problems without rule, problems with AA rule, with AN rule, with NA rule, and with NN rule. In each panel with rule, ratings for the rule designated plausible are shown along the bold line, and ratings for the rule designated implausible are along the gray line. The bottom right panel shows mean parameter estimates (multiplied by 100) of the knowledge parameters and of the τ parameters (with values marked as τ) for the dual-source model. AA, AN, NA, and NN refer to conditional statements that differ in whether antecedent and/or consequent are affirmed (A) or negated (N). MP, MT, AC, and DA refer to the four conditional inferences (respectively, modus ponens, modus tollens, affirmation of the consequent, and denial of the antecedent).

Prediction 3. The mean τ parameters are also shown in Figure 6 (see bottom right panel). As can be seen, they again follow the expected pattern with MP > MT > AC > DA. An analysis of variance of the τ parameters with factor inference (MP, MT, AC, and DA) showed a significant main effect of inference, F(1.99, 23.82) = 8.18, p < .01. Planned contrasts revealed that τ for the MP inference exceeded the mean τ for MT significantly, F(1, 12) = 19.32, p < .01, whereas τ for MT did not significantly exceed the mean τ for AC and DA (F < 1). The decrease from AC to DA was significant, F(1, 12) = 5.03, p = .045.⁶

Summary. The dual-source model again provided a satisfactory account of the data. It did so much more parsimoniously than the model by Oaksford et al. (2000). Despite similar overall levels of goodness of fit, only the dual-source model, but not the model by Oaksford et al., adequately reproduced the pattern of ratings as a function of polarity and inference (see Figure 7). As in Oaksford et al.’s model, polarity bias is located in the knowledge-based component in the dual-source model: It reflects higher estimates for the probability of negated conclusions than of their affirmative counterparts.

General Discussion

What precisely is the role of the conditional rule in probabilistic conditional reasoning? In one view, asserting a conditional rule acts on the reasoner’s knowledge base by depressing the perceived likelihood of exceptions, that is, of cases in which the antecedent is fulfilled, but not the consequent (Oaksford & Chater, 2007, Chapter 5). The purpose of the present article is to develop and test an alternative view. In this view, the logical form of the problem is a decontextualized source of evidence that is integrated with knowledge-based evidence in assessing the probability of proposed conclusions. Without conditional rule, there is no complete logical form in the first place, and reasoners’ responses are then based on the evidence derived from background knowledge exclusively. Asserting a conditional rule in this view does not act on the reasoner’s knowledge base; instead, it makes available an additional source of evidence, based on logical form rather than on knowledge about the presented contents.

This view was specified as a dual-source account of probabilistic conditional reasoning. According to the model, participants integrate two sources of evidence, logical form and prior knowledge. Form-based evidence comes into play to the extent to which a rule is stated in the first place, to the extent to which its relevance is emphasized, and to the extent to which a rule of the given form is seen as warranting a given inference, irrespective of content. Conversely, prior knowledge, operationalized summarily by means of bivariate probability information relating p and q, influences probability ratings to the extent to which rule relevance is downplayed and to the extent to which the given logical form is not seen as warranting a given inference. When no rule is stated in the first place, there is no complete logical form and the ratings are based on background knowledge exclusively. As already mentioned, the model can be framed as a normative model of Bayesian model averaging.
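The integration just described can be sketched in a few lines of Python. The sketch assumes that the model equation takes a weighted-mixture form: with weight λ the rating relies on form-based evidence, which itself falls back on knowledge to the degree 1 − τ that the form is not seen as warranting the inference, and with weight 1 − λ it relies on knowledge alone. The function and argument names are illustrative.

```python
def dual_source_prediction(tau: float, xi: float, lam: float) -> float:
    """Predicted probability rating (0 to 1) for one inference and content.

    tau: degree to which the logical form is seen as warranting the inference.
    xi:  knowledge-based probability of the conclusion for this content.
    lam: relative weight of form-based evidence (0 when no rule is stated).
    """
    for name, value in (("tau", tau), ("xi", xi), ("lam", lam)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1]")
    # Form-based evidence: the form warrants the conclusion with probability tau;
    # otherwise the reasoner falls back on background knowledge.
    form_based = tau + (1.0 - tau) * xi
    # Assumed mixture of form-based and knowledge-based evidence.
    return lam * form_based + (1.0 - lam) * xi


# Without a rule (lam = 0) the prediction equals the knowledge parameter.
print(dual_source_prediction(tau=0.9, xi=0.6, lam=0.0))
# With a rule, the rating is pulled upward from xi toward 1.
print(dual_source_prediction(tau=0.9, xi=0.6, lam=0.8))
```

Under this form the prediction can never fall below the knowledge parameter, which matches the observation that adding a rule compresses ratings toward higher levels.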

It is difficult to assess the role of the conditional rule in probabilistic conditional reasoning when there is no control condition without conditional rule. For that reason, we always implemented a baseline condition in which problems were presented without rule. Across four experiments and in the reanalysis of Liu’s (2003) data, the model provided a relatively parsimonious account of complex data patterns. What is more, the experimental manipulations mapped on the different model parameters in the expected fashion. Thus, emphasizing the validity of the rule (Experiment 1) increased the relative weight given to rule-based evidence. The knowledge parameters ξ reflected the manipulations of perceived sufficiency and necessity (Experiments 1, 2, and 3). The τ parameters for the degree to which a given inference is seen as warranted by logical form roughly followed a conditional pattern with MP and MT receiving higher values than AC and DA, although there were individual differences, with some participants showing a biconditional pattern (all experiments). Using rules of the form “only–if” in addition led to a focused effect on the τ parameters in that τ(MT) was increased for only–if relative to if–then (Experiment 3). Finally, the model accounted for observed polarity biases in the negations paradigm parsimoniously (Experiment 4). A control experiment (Experiment 2) defended two procedural choices implemented in the other experiments: (a) to obtain the baseline ratings without rule first and (b) to permit participants to reevaluate problems that they rated highly inconsistently.

Across these experiments, the dual-source model was successfully evaluated (a) in terms of goodness of fit, (b) in terms of whether the effects of experimental manipulations targeted at specific model parameters indeed affected these parameters as expected, and (c) in terms of whether the model adequately reproduced critical data patterns (polarity biases).

In the following sections, we compare the model parameters across experiments and consider the relationship of the dual-source model to alternative accounts of probabilistic conditional reasoning as well as to broader theories of reasoning.

⁶ We also fit a version of the dual-source model with different τ parameters for each kind of rule (AA, AN, NA, and NN). This model requires 16 τ parameters and a total of 29 parameters. Mean R² was .87, which was significantly higher than mean R² for Oaksford et al.’s (2000) model, t(12) = 2.64, p = .02, which required 28 parameters. This version of our model allowed us to test whether negations in the rule exerted an effect on the τ parameters. In an analysis of variance with rule kind and inference as factors, neither the main effect of rule kind, F(2.59, 31.11) = 1.46, p = .25, nor its interaction with inference, F(4.52, 54.29) = 1.00, p = .42, was significant. Thus, there is little evidence for substantial effects of negations in the rule on the τ parameters. This agrees well with the finding that the dual-source model with τ parameters constant across rule kind adequately accounts for the polarity biases observed in the ratings as just reported.


Comparisons Between Experiments

In this section, we compare the dual-source model parameters across experiments. This analysis can be seen as assessing the impact of individual differences between the different samples of participants and the impact of other contextual and procedural variables that differed between experiments.

τ parameters. τ parameters from all experiments other than those estimated for the only–if problems in Experiment 3 were entered into an analysis of variance with factors inference (MP, MT, AC, and DA) and experiment. None of the effects involving experiment was significant (largest F = 1.67, smallest p = .18). Thus, the profile of τ parameters was consistent across experiments. The parameter estimates are shown in Figure 8 as a function of inference and experiment along with the mean τ parameters over all participants. As can be seen, the overall means follow the expected pattern with τ values ordered approximately as MP > MT > AC > DA.

λ parameters. We also compared λ parameters for the relative weight of the form-based evidence across experimental conditions. We left out the λ parameters estimated for the phase with emphasis on the rule in Experiment 1, because in this condition an effort had been made to alter the value of λ from that in the standard conditions. An analysis of variance with experiment as factor revealed significant differences between the experiments, F(3, 69) = 3.13, p = .03: λ tended to be smaller in Experiments 1 and 2 (M = .63, SD = .37) than in Experiments 3 and 4 (M = .85, SD = .24). It is difficult to pinpoint the cause of this difference. One difference between the two groups of experiments is that in Experiments 1 and 2, there was only one rule per content, whereas several rules were used with each content in Experiments 3 and 4. This may have made the rules more salient in the latter experiments, but it is also possible that the differences reflect individual differences between the individuals sampled for the different experiments (see next section).

[Figure 7. Panels: Observed Data; Dual-Source Model; Oaksford et al. Model. Horizontal axis: inference (MP, MT, AC, DA). Vertical axis: Probability Rating (left panel) and Predicted Probability Rating (middle and right panels), 60 to 95. Points marked + (affirmative conclusion) and − (negated conclusion); asterisks mark significant polarity effects.]

Figure 7. Mean ratings as a function of inference (MP [modus ponens], MT [modus tollens], AC [affirmation of the consequent], and DA [denial of the antecedent]) and conclusion polarity (affirmative [+] vs. negated [−]). The left panel shows the observed ratings; the middle and right panels show the model predictions under the dual-source model and Oaksford et al.’s (2000) model, respectively. Asterisks mark inferences with significant polarity effects (at the 5% level of significance).

Knowledge parameters. In Experiments 1, 2, and 3 the same contents were used with p and q interchanged for the HL and LH rules in Experiment 3. Taking that change into account, we submitted the ξ parameters to an analysis of variance with the within-participants factors content and inference and the between-participants factor experiment. This revealed a three-way interaction of all three factors, F(9.85, 280.58) = 2.21, p = .02.

To explore the interaction, we conducted separate analyses for each content. This revealed no significant effects or interactions involving the factor experiment for the HH, HL, and LH contents (largest F = 2.96, smallest p = .06). The factor experiment had a more pronounced impact on the LL content, where it interacted significantly with inference, F(4.08, 116.25) = 6.24, p < .01, and exerted a main effect, F(2, 57) = 3.19, p = .049.

Thus, differences between the experiments in the knowledge parameters are largely confined to the LL content (“If a person drinks lots of Coke, then that person will gain weight”). This makes intuitive sense: Knowledge about the HH (“If a predator is hungry, it will search for prey”), HL (“If a balloon is pricked with a needle, then it will pop”), and LH (“If a girl has sexual intercourse, then she will be pregnant”) contents is likely to be shared to a greater extent than that for the LL content, which intuitively leaves more room for subjective assessments.

Relationship to Other Accounts

Second-order conditionalization. In the introduction, we discussed the relationship of the dual-source model to the concept of second-order conditionalization by Liu (2003). One way to look at the dual-source model is to say that it extends Liu’s concept of second-order conditionalization: It provides an explicit model of the second-order conditional probabilities, and it relaxes the assumption implicit in that concept that reliance on the rule is perfect if a rule is given and the second-order conditional probabilities can be computed. The dual-source model thus admits that the major premise of a conditional syllogism, the rule, may be uncertain. Like most approaches in the field, it still considers the given minor premise as certain (but see Over & Hadjichristidis, 2009).

Oaksford et al.’s (2000) model. An alternative, probabilistic model of conditional inference was presented by Oaksford et al. The dual-source model provided better descriptions of the data than the Oaksford et al. model with the exception of Experiment 2, in which both models performed equivalently (as expected). The dual-source model achieved significantly better goodness-of-fit values in Experiments 1 and 3. In Experiment 4, it gave a much more parsimonious description of the data, with only about half as many parameters as the Oaksford et al. model and with an equivalent level of goodness of fit. In addition, the dual-source model provided the more adequate account of the effect pattern of conclusion polarity and inference in Experiment 4 (see Figure 7).

[Figure 8. Single panel: Estimates of Parameter τ. Horizontal axis: inference (MP, MT, AC, DA). Vertical axis: Parameter Estimates (in Percent), 30 to 100. Legend: Exp. 1, Exp. 2, Exp. 3 (if–then), Exp. 4, Mean.]

Figure 8. Mean τ parameters as a function of inference (MP [modus ponens], MT [modus tollens], AC [affirmation of the consequent], and DA [denial of the antecedent]) and experiment, along with the overall means across all participants.

Another advantage of the dual-source model is conceptual: Manipulations of the relevant knowledge via the used contents (Experiments 1 to 3), manipulations of logical form (Experiment 3), and manipulations of rule relevance via instructions (Experiment 1) all mapped on different model parameters, parameters intended to capture processes assumed to be differentially sensitive to these qualitatively different manipulations on theoretical grounds. In contrast, all parameters of Oaksford et al.’s (2000) model summarize the reasoner’s relevant knowledge, and thus, the different experimental manipulations do not map cleanly on separable parameters. For example, the exceptions parameters in Oaksford et al.’s model were sensitive to (a) the particular content used, (b) the presence or absence of a rule, (c) the kind of rule used, and (d) the emphasis that instructions placed on rule relevance. As a side effect, a new exceptions parameter has to be estimated for each rule and content combination in Oaksford et al.’s model, causing the model to be much less parsimonious when there are many such combinations as in the negations paradigm.

Nevertheless, we believe that it is premature to reject Oaksford et al.’s (2000) model altogether at this point because in absolute terms the model’s disadvantage in goodness of fit was small and because of its appealing conceptual simplicity: It requires only one source of information, namely the bivariate probability distribution of p and q that is assumed to be altered by the presence of a rule. It is possible that modifications and adaptations of that model can be found that are more successful than the modifications that we tried out to remove the above problems. We believe, however, that at this stage it is fair that the burden of proof of this possibility should reside with the proponents of that model.

Although we do not claim to have refuted Oaksford et al.’s (2000) model decisively at this point, we do believe that we have established a viable alternative to it in the dual-source model. It provided better descriptions of the data with fewer parameters; like Oaksford et al.’s model, it can be given a normative interpretation; and it is capable of dealing with rule forms and connectives other than if–then without modifications. For example, it can be used with “only–if,” with “or,” and with other connectives simply by estimating new values for the τ parameters for the certainty with which the resulting logical forms are seen to warrant the studied inferences.

Verschueren et al.’s (2005a) dual-process model. Verschueren et al. (2005a, 2005b) have proposed a dual-process model of conditional reasoning in knowledge-rich contexts. These authors make a distinction between perceived sufficiency and necessity on the one hand and counterexample information in the form of alternative antecedents and disabling conditions on the other hand (but see Geiger & Oberauer, 2007). They argued that conditional reasoning can recruit both a fast and relatively undemanding heuristic process and an analytical process that takes more time and imposes a larger load on working-memory resources. Importantly, the heuristic process draws on perceived necessity and sufficiency, whereas the analytical process relies on counterexample information. From the present point of view, both processes are thereby grounded in the knowledge-based mode of reasoning. What Verschueren et al. (2005a, 2005b) showed is that the knowledge-based mode of reasoning may itself recruit several dissociable processes that differ in speed and working-memory demands (but see Geiger & Oberauer, 2007). As a consequence, we would predict that the effects reported by Verschueren et al. (2005a, 2005b) would be obtained even if the conditional rule is omitted from all problems and questions, so that participants must rely on background knowledge about the presented contents in the absence of complete logical forms.

Suppositional theory of “if.” The suppositional theory of “if” (Evans & Over, 2004, Chapter 8) received partial support. According to that theory, if–then leads reasoners to focus on p cases. The probability of the conditional as well as the confidence in asserting q given p is then derived in a process that is sensitive to the relative frequency of cases with p and q versus cases with p and not-q. As a result, both the confidence in the conditional rule as well as in MP are primarily dependent upon perceived sufficiency P(q|p). In the present case, an estimate of P(q|p) was given by the ratings for MP problems without rule. In line with the suppositional account, perceived sufficiency predicted the rank orders of rated rule believability as well as MP and MT ratings for problems with rule in all of our experiments.

On the other hand, the effect of adding a rule is in general inversely related to the ratings of rule believability and MP ratings without rule assessing perceived sufficiency. For example, in Experiment 1, MP problems without rule received ratings of, in order, 88, 94, 34, and 61 for the HH, HL, LH, and LL contents; rule believability was rated similarly as, in order, 89, 92, 39, and 56; yet, adding a rule increased MP ratings in Phase 2 of Experiment 1 by, in order, 7, 3, 46, and 24 points on the percent scale.

Some of this inverse relationship is undoubtedly due to ceiling effects: If rule believability and MP rating without rule are already close to 100, there is little room for further increases. However, the inverse relationship also holds for LH and LL contents, which start out at low to intermediate levels of believability (for ratings without rule) and end well below the ceiling (for ratings with rule; see for example contents LH and LL in Experiment 1). This somewhat paradoxical inverse relationship between (a) the size of rule effects on MP ratings and (b) MP ratings in the baseline phase without rule directly follows from the dual-source model: It is a consequence of the fact that the knowledge-based component enters the MP ratings with rule with a weight factor smaller than one, leading to a compression of the differences between contents relative to the baseline phase that is exclusively driven by the knowledge-based component.
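The compression argument can be illustrated numerically. The sketch below assumes that the dual-source prediction for a rated inference with rule takes the form λ[τ + (1 − τ)ξ] + (1 − λ)ξ, where ξ is the knowledge-based estimate, τ is the form-based certainty, and λ is the weight given to form-based evidence. The function name and the values λ = 0.8 and τ = 0.75 are hypothetical choices made for illustration, not estimates from the experiments; the baseline values are the MP ratings without rule from Experiment 1, rescaled to the unit interval.

```python
def dual_source_rating(xi, tau, lam):
    """Predicted rating (0-1 scale) for an inference when the rule is present.

    xi  : knowledge-based estimate (the rating without rule)
    tau : form-based certainty that the rule warrants the inference
    lam : weight given to form-based evidence
    Assumed form: lam * (tau + (1 - tau) * xi) + (1 - lam) * xi
    """
    return lam * (tau + (1.0 - tau) * xi) + (1.0 - lam) * xi


# MP ratings without rule in Experiment 1 (HH, HL, LH, LL), as proportions.
baseline = {"HH": 0.88, "HL": 0.94, "LH": 0.34, "LL": 0.61}

# Hypothetical parameter values, chosen for illustration only.
LAM, TAU = 0.8, 0.75

# Rule effect = predicted rating with rule minus the baseline rating.
rule_effect = {
    content: dual_source_rating(xi, TAU, LAM) - xi
    for content, xi in baseline.items()
}

for content, effect in rule_effect.items():
    print(f"{content}: baseline {baseline[content]:.2f}, rule effect {effect:+.3f}")
```

Under this assumed form the rule effect simplifies to λτ(1 − ξ), so it is largest for contents with low baseline ratings and shrinks toward zero as the baseline approaches the ceiling. This gives the same ordering of rule effects (LH, LL, HH, HL, from largest to smallest) as the increases reported above for Experiment 1.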

It is difficult to see at first glance how the suppositional account of conditional reasoning (Evans & Over, 2004) would deal with this dissociation: The MP problem without rule, like the rule itself, should focus participants on cases with p. Thus, it should not make much of a difference whether a rule is stated for MP ratings. This expectation is underlined by the surprisingly close correspondence between ratings for MP problems without rule and ratings of rule believability in our data. Yet, MP ratings with rule differed quite strongly from ratings of rule believability and ratings of MP problems without rule.

One additional aspect of the suppositional account is, however, that it is embedded in a dual-system framework. According to dual-system theories (for a review see Frankish & Evans, 2009), conditional inference integrates the outcomes of two distinct systems, one system being characterized as unconscious, rapid, automatic, high capacity, and contextualized, and the second as conscious, slow, effortful, deliberative, and decontextualized. It may be that the presence of the rule triggers the second system, thought to be capable of logical inferences, to a larger extent than when problems are presented without rule. From this perspective, the dual-source model can be seen as a specification of the general idea that two systems may be involved in probabilistic conditional reasoning.

Dual-process theories in general. There is indeed an obvious relationship between the dual-source model and dual-system/dual-process theories in terms of the feature “contextualized” versus “decontextualized.” Rule-based evidence is conceived of as decontextualized in the dual-source model, depending only upon rule form irrespective of content. Content-based evidence is by definition contextualized and domain-specific. We hesitate, however, to take a firm stance with respect to ascribing the other attributes such as automaticity, efficiency, speed, and so forth to one of the two sources of evidence and the processes recruited to process them. As already mentioned, we believe that both modes comprise several dissociable processes that differ on many of the just-mentioned attributes within each mode (see also Evans, 2009). For example, we argued above that Verschueren et al. (2005a, 2005b) have proposed that the knowledge-based mode can draw on at least two processes, one heuristic (fast and relatively efficient) and another one more analytical (slow and less efficient; see also Beller & Spada, 2003).

In conclusion, the success of the present dual-source model in accounting for complex patterns of data in a psychologically meaningful way suggests that it may be premature to abandon the idea that there is an abstract, decontextualized representation of conditional rules operating in probabilistic conditional inference. This runs counter to the dominant knowledge-based view of probabilistic conditional reasoning as reviewed in the introduction, but it creates a link to previous work on reasoning (albeit outside the domain of specifically conditional reasoning) that demonstrated that different modes of reasoning can be elicited by different instructions, a form-based mode by deductive instructions, and a knowledge-based mode by inductive instructions (Heit & Rotello, 2005, 2008; Rips, 2001; Rotello & Heit, 2009). The present work also extends this latter line of research by showing that even under purely inductive instructions, both modes of reasoning are recruited, determining responses jointly as specified by the dual-source model. A rule when present raises one’s subjective confidence in certain inferences irrespective of content, whereas probabilistic prior knowledge about the content domain under study comes into play to the extent to which the rule-based evidence is weak.

References

Beller, S. (2008). Deontic norms, deontic reasoning, and deontic conditionals. Thinking & Reasoning, 14, 305–341.

Beller, S., & Kuhnmünch, G. (2007). What causal conditional reasoning tells us about people’s understanding of causality. Thinking & Reasoning, 13, 426–460.

Beller, S., & Spada, H. (2003). The logic of content effects in propositional reasoning: The case of conditional reasoning with a point of view. Thinking & Reasoning, 9, 335–379.

Braine, M. D. S. (1978). On the relation between the natural logic of reasoning and standard logic. Psychological Review, 85, 1–21.

Cummins, D. D. (1995). Naive theories and causal deduction. Memory & Cognition, 23, 646–658.

Cummins, D. D., Lubart, T., Alksnis, O., & Rist, R. (1991). Conditional reasoning and causation. Memory & Cognition, 19, 274–282.

De Neys, W., Schaeken, W., & d’Ydewalle, G. (2003). Inference suppression and semantic memory retrieval: Every counterexample counts. Memory & Cognition, 31, 581–595.

Evans, J. St. B. T. (1993). The mental model theory of conditional reasoning: Critical appraisal and revision. Cognition, 48, 1–20.

Evans, J. St. B. T. (2009). How many dual-process theories do we need? One, two, or many? In J. St. B. T. Evans & K. Frankish (Eds.), In two minds: Dual processes and beyond (pp. 33–54). New York, NY: Oxford University Press.

Evans, J. St. B. T., Handley, S. J., & Over, D. E. (2003). Conditionals and conditional probability. Journal of Experimental Psychology: Learning, Memory, and Cognition, 29, 321–355.

Evans, J. St. B. T., & Lynch, J. S. (1973). Matching bias in the selection task. British Journal of Psychology, 64, 391–397.

Evans, J. St. B. T., Newstead, S. E., & Byrne, R. M. J. (1993). Human reasoning. Hillsdale, NJ: Erlbaum.

Evans, J. St. B. T., & Over, D. E. (2004). If. Oxford, England: Oxford University Press.

Frankish, K., & Evans, J. St. B. T. (2009). Systems and levels: Dual-system theories and the personal–subpersonal distinction. In J. St. B. T. Evans & K. Frankish (Eds.), In two minds: Dual processes and beyond (pp. 89–107). New York, NY: Oxford University Press.

Geiger, S. M., & Oberauer, K. (2007). Reasoning with conditionals: Does every counterexample count? It’s frequency that counts. Memory & Cognition, 35, 2060–2074.

George, C. (1995). The endorsement of the premises: Assumption-based or belief-based reasoning. British Journal of Psychology, 86, 93–111.

Handley, S., & Feeney, A. (2003). Representation, pragmatics and process in model-based reasoning. In W. Schaeken, A. Vandierendonck, W. Schroyens, & G. d’Ydewalle (Eds.), The mental models theory of reasoning: Refinements and extensions (pp. 25–49). Hillsdale, NJ: Erlbaum.

Heit, E., & Rotello, C. M. (2005). Are there two kinds of reasoning? In B. G. Bara, L. Barsalou, & M. Bucciarelli (Eds.), Proceedings of the 27th annual meeting of the Cognitive Science Society (pp. 923–928). Mahwah, NJ: Erlbaum.

Heit, E., & Rotello, C. M. (2008). Modeling two kinds of reasoning. In B. C. Love, K. McRae, & V. M. Sloutsky (Eds.), Proceedings of the 30th annual meeting of the Cognitive Science Society (pp. 1831–1836). Austin, TX: Cognitive Science Society.

Johnson-Laird, P. N., Byrne, R. M. J., & Schaeken, W. (1992). Propositional reasoning by model. Psychological Review, 99, 418–439.

Liu, I. (2003). Conditional reasoning and conditionalization. Journal of Experimental Psychology: Learning, Memory, and Cognition, 29, 694–709.

Liu, I., Lo, K., & Wu, J. (1996). A probabilistic interpretation of “if–then.” The Quarterly Journal of Experimental Psychology, 49, 828–844.

Markovits, H., & Handley, S. (2005). Is inferential reasoning just probabilistic reasoning in disguise? Memory & Cognition, 33, 1315–1323.

Markovits, H., & Thompson, V. (2008). Different developmental patterns of simple deductive and probabilistic inferential reasoning. Memory & Cognition, 36, 1066–1078.

Matarazzo, O., & Baldassarre, I. (2008). Probability and instruction effects in syllogistic conditional reasoning. Proceedings of World Academy of Science, Engineering and Technology, 33, 427–435.

Myung, I. J. (2000). The importance of complexity in model selection. Journal of Mathematical Psychology, 44, 190–204.

Oaksford, M., & Chater, N. (2007). Bayesian rationality. Oxford, England: Oxford University Press.

Oaksford, M., Chater, N., & Larkin, J. (2000). Probabilities and polarity biases in conditional inference. Journal of Experimental Psychology: Learning, Memory, and Cognition, 26, 883–899.

Oberauer, K., & Wilhelm, O. (2003). The meaning(s) of conditionals: Conditional probabilities, mental models, and personal utilities. Journal of Experimental Psychology: Learning, Memory, and Cognition, 29, 680–693.

O’Brien, D. P., Dias, M. G., & Roazzi, A. (1998). A case study in the mental models and mental-logic debate: Conditional syllogisms. In D. S. Braine & D. P. O’Brien (Eds.), Mental logic (pp. 385–420). London, England: Erlbaum.

O’Hagan, A., & Forster, J. (2004). Kendall’s advanced theory of statistics: Volume 2B: Bayesian inference. London, England: Arnold.

Over, D. E., & Hadjichristidis, C. (2009). Uncertain premises and Jeffrey’s rule. Behavioral and Brain Sciences, 32, 97–98.

Rips, L. J. (1994). The psychology of proof. Cambridge, MA: MIT Press.

Rips, L. J. (2001). Two kinds of reasoning. Psychological Science, 12, 129–134.

Rotello, C. M., & Heit, E. (2009). Modeling the effects of argument length and validity on inductive and deductive reasoning. Journal of Experimental Psychology: Learning, Memory, and Cognition, 35, 1317–1330.

Rouder, J. N., Lu, J. D., Morey, R., Sun, D., & Speckman, P. L. (2008). A hierarchical process-dissociation model. Journal of Experimental Psychology: General, 137, 370–398.

Schroyens, W., Schaeken, W., & d’Ydewalle, G. (2001). A meta-analytic review of conditional reasoning by model and/or rule: Mental models theory revised. Unpublished manuscript, University of Leuven, Belgium.

Stevenson, R. J., & Over, D. E. (2001). Reasoning from uncertain premises: Effects of expertise and conversational context. Thinking & Reasoning, 7, 367–390.

Thompson, V. A. (1994). Interpretational factors in conditional reasoning. Memory & Cognition, 22, 742–758.

Thompson, V. A., & Mann, J. M. (1995). Perceived necessity explains the dissociation between logic and meaning: The case of “only if.” Journal of Experimental Psychology: Learning, Memory, and Cognition, 21, 1554–1567.

Verschueren, N., Schaeken, W., & d’Ydewalle, G. (2005a). A dual-process specification of causal conditional reasoning. Thinking & Reasoning, 11, 239–278.

Verschueren, N., Schaeken, W., & d’Ydewalle, G. (2005b). Everyday conditional reasoning: A working memory-dependent tradeoff between counterexample and likelihood use. Memory & Cognition, 33, 107–119.

Received February 25, 2009
Revision received December 4, 2009
Accepted December 7, 2009


Correction to Klauer et al. (2010)

In the article “Conditional Reasoning in Context: A Dual-Source Model of Probabilistic Inference,” by Karl Christoph Klauer, Sieghard Beller, and Mandy Hütter (Journal of Experimental Psychology: Learning, Memory, and Cognition, 2010, Vol. 36, No. 2, pp. 298–323), the dual-source model is overparameterized. Only the products λτ of the λ and τ parameters are uniquely identified by the data. This has no consequences for the ξ parameters, for ratios of τ parameters estimated with the same λ, for ratios of λ parameters associated with the same τ parameters, nor for the fit values. The model fit is, however, achieved more parsimoniously than stated in Klauer et al. because one parameter (Experiments 1, 2, and 4) or two parameters (Experiment 3) are redundant.

To fix the scale for the τ and λ parameters, one of them has to be set to one. We recommend setting the largest of τ(MP), τ(MT), τ(AC), and τ(DA) equal to one. This yields unique parameter estimates for τ and λ but has consequences for their interpretation: Differences in overall level of the profile of τ parameters over the four inferences (due to, e.g., differences in cognitive load), if any, would be removed from the τ estimates and would show up in the λ parameters. The above constraint is the one implicitly imposed almost perfectly by the estimation method used in Klauer et al. (2010). In consequence, when the constraint is explicitly enforced, the numerical values of the parameter estimates reported in Klauer et al. change only minimally, and the outcome of all of the significance tests reported remains the same.
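The identification problem and the recommended constraint can be sketched in a few lines. The sketch assumes that the model prediction has the form λ[τ + (1 − τ)ξ] + (1 − λ)ξ, which reduces algebraically to λτ + (1 − λτ)ξ and therefore depends on λ and τ only through their product. The function names and all numerical values are hypothetical and serve only to illustrate the rescaling.

```python
INFERENCES = ("MP", "MT", "AC", "DA")


def predict(xi, tau, lam):
    """Assumed dual-source prediction on a 0-1 scale."""
    return lam * (tau + (1.0 - tau) * xi) + (1.0 - lam) * xi


def normalize(tau, lam):
    """Rescale so that the largest tau equals one.

    Every product lam * tau is left unchanged, so predictions are unaffected.
    Requires at least one tau greater than zero.
    """
    top = max(tau.values())
    if top <= 0.0:
        raise ValueError("at least one tau must be positive")
    tau_scaled = {name: value / top for name, value in tau.items()}
    return tau_scaled, lam * top


# Hypothetical, unnormalized parameter values.
tau = {"MP": 0.8, "MT": 0.6, "AC": 0.4, "DA": 0.3}
lam = 0.5
xi = {"MP": 0.9, "MT": 0.7, "AC": 0.5, "DA": 0.4}

tau_n, lam_n = normalize(tau, lam)

# Predictions before and after imposing the constraint.
before = {name: predict(xi[name], tau[name], lam) for name in INFERENCES}
after = {name: predict(xi[name], tau_n[name], lam_n) for name in INFERENCES}

for name in INFERENCES:
    print(f"{name}: {before[name]:.4f} -> {after[name]:.4f}")
```

Because the rescaling divides every τ by the same constant and multiplies λ by it, ratios of τ parameters that share a λ are preserved, while any difference in the overall level of the τ profile is absorbed into λ, as described above.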