
Lawson's Classroom Test Of Scientific Reasoning

LCTSR

    Overview
    Listed below is general information about the instrument.
    Summary
    Original author(s)
    • Lawson, A. E., Clark, B., Cramer-Meldrum, E., Falconer, K. A., Sequist, J. M., & Kwon, Y.-J.

    Original publication
    • Lawson, A. E., Clark, B., Cramer-Meldrum, E., Falconer, K. A., Sequist, J. M., & Kwon, Y.-J. (2000). Development of Scientific Reasoning in College Biology: Do Two Levels of General Hypothesis-Testing Skills Exist? Journal of Research in Science Teaching, 37(1), 81-101.

    Year original instrument was published 2000
    Inventory
    Number of items 11
    Number of versions/translations 3
    Cited implementations 12
    Language
    • English
    Country United States
    Format
    • Multiple Choice
    • Open Ended
    Intended population(s)
    • Students
    • Undergraduate
    • Teachers
    • Secondary School
    Domain
    • Cognitive
    Topic
    • Problem Solving
    • Scientific Reasoning
    Evidence
    The CHIRAL team carefully combs through every reference that cites this instrument and pulls all evidence that relates to the instrument’s validity and reliability. These data are presented in the following table, which simply notes the presence or absence of evidence related to each concept but does not indicate the quality of that evidence. Similarly, if evidence is lacking, that does not necessarily mean the instrument is “less valid,” just that the evidence was not presented in the literature. Learn more about this process by viewing the CHIRAL Process and consult the instrument’s Review (next tab), if available, for better insights into the usability of this instrument.

    Information in the table is given in four different categories:
    1. General - information about how each article used the instrument:
      • Original development paper - indicates the paper(s) in which the instrument was initially developed
      • Uses the instrument in data collection - indicates whether an article administered the instrument and collected responses
      • Modified version of existing instrument - indicates whether an article has modified a prior version of this instrument
      • Evaluation of existing instrument - indicates whether an article explicitly provides evidence that attempts to evaluate the performance of the instrument; the lack of a checkmark here implies that an article administered the instrument but did not evaluate the instrument itself
    2. Reliability - information about the evidence presented to establish reliability of data generated by the instrument; please see the Glossary for term definitions
    3. Validity - information about the evidence presented to establish validity of data generated by the instrument; please see the Glossary for term definitions
    4. Other Information - information that may or may not directly relate to the evidence for validity and reliability, but is commonly reported when evaluating instruments; please see the Glossary for term definitions
    Publications: 1 2 3 4 5 6 7 8 9 10 11 12

    General

    Original development paper
    Uses the instrument in data collection
    Modified version of existing instrument
    Evaluation of existing instrument

    Reliability

    Test-retest reliability
    Internal consistency
    Coefficient (Cronbach's) alpha
    McDonald's Omega
    Inter-rater reliability
    Person separation
    Generalizability coefficients
    Other reliability evidence

    Validity

    Expert judgment
    Response process
    Factor analysis, IRT, Rasch analysis
    Differential item function
    Evidence based on relationships to other variables
    Evidence based on consequences of testing
    Other validity evidence

    Other information

    Difficulty
    Discrimination
    Evidence based on fairness
    Other general evidence
    Review
    DISCLAIMER: The evidence supporting the validity and reliability of the data summarized below is for use of this assessment instrument within the reported settings and populations. The continued collection and evaluation of validity and reliability evidence, in both similar and dissimilar contexts, is encouraged and will support the chemistry education community’s ongoing understanding of this instrument and its limitations.
    This review was generated by a CHIRAL review panel. Each CHIRAL review panel consists of multiple experts who first individually review the citations of the assessment instrument listed on this page for evidence in support of the validity and reliability of the data generated by the instrument. Panels then meet to discuss the evidence and summarize their opinions in the review posted in this tab. These reviews summarize only the evidence that was discussed during the panel, which may not represent all evidence available in the published literature or that which appears on the Evidence tab.
    If you feel that evidence is missing from this review, or that something was documented in error, please use the CHIRAL Feedback page.

    Panel Review: Lawson's Classroom Test Of Scientific Reasoning (LCTSR)

    (Post last updated 09 June 2023)

    Review panel summary   
    Lawson’s Classroom Test of Scientific Reasoning (LCTSR) was developed in 2000 by modifying the 1978 version of Lawson’s Classroom Test of Formal Reasoning (CTFR-78) [1,5]. The LCTSR has been used to measure scientific reasoning of students in undergraduate introductory biology [1,2,6,8,9], physics [3,5], astronomy [3], and chemistry [4,10] courses, as well as during a professional development workshop for secondary teachers [7]. The LCTSR is generally administered as a 24-item (12 question pair) multiple-choice test [2-10], although the original development paper states it was administered as a written test [1]. The items are associated with six different reasoning patterns: conservation, proportional reasoning, control of variables, probability, correlation reasoning, and hypothetical-deductive reasoning [1,5]. The first 10 question pairs follow a common two-tier multiple-choice design where participants are first asked to select their response to a prompt and then asked to select an answer that supports that response [5]. The last two question pairs are designed slightly differently; the first asks participants to select an experimental design to test a hypothesis and then select an outcome that would disprove the hypothesis, while the last question pair asks participants to select outcomes that would disprove two hypotheses for an experiment [5]. These two question pairs were specifically developed to assess participants’ hypothetical-deductive reasoning [1,5].

    The structure of the items has led to three different methods of scoring. The first is scoring each item individually for a maximum of 24 points [2,5,6,7]. The second is pair scoring, where participants must answer both items in a question pair correctly to score 1 point; pair scoring all 12 question pairs gives a maximum of 12 points [3,4,5,7]. Since the last question pair asks students about two different hypotheses, some studies score this last pair individually while scoring the other 11 question pairs pairwise, leading to a maximum of 13 points [1,8,10]. Some studies also grouped participants into reasoning levels based on their score [1,2,3,7,10]. For example, when scoring out of 13 total points, participants could be grouped as Level 0 (0-3 points), Low Level 1 (4-6 points), High Level 1 (7-10 points), or Level 2 (11-13 points) [1]. Similar reasoning levels created from LCTSR scores have been used to create collaborative groups that varied in the composition of students’ reasoning levels, in order to assess the effect of different group compositions on reasoning gains in different classroom environments (e.g., inquiry vs. didactic) [2].
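
    The three scoring schemes and the 13-point reasoning-level cut scores described above can be summarized in a short sketch. The code below is only an illustration of the scoring rules reported in [1,5]; the function names and the response encoding are assumptions, not part of any published scoring script.

```python
# Illustrative sketch of the three LCTSR scoring schemes described above.
# `responses` is assumed to be a list of 24 booleans (True = item answered
# correctly), ordered so that items 2k and 2k+1 form question pair k (k = 0..11).

def score_individual(responses):
    """Individual scoring: 1 point per item, maximum 24."""
    return sum(responses)

def score_pairwise(responses):
    """Pair scoring: 1 point per question pair answered fully correctly, maximum 12."""
    return sum(responses[2 * k] and responses[2 * k + 1] for k in range(12))

def score_hybrid(responses):
    """Pairs 1-11 scored pairwise, the last pair scored item-by-item, maximum 13."""
    pairs = sum(responses[2 * k] and responses[2 * k + 1] for k in range(11))
    return pairs + responses[22] + responses[23]

def reasoning_level(score_13):
    """Reasoning-level cut scores reported for the 13-point scheme [1]."""
    if score_13 <= 3:
        return "Level 0"
    if score_13 <= 6:
        return "Low Level 1"
    if score_13 <= 10:
        return "High Level 1"
    return "Level 2"
```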

    Evidence based on test content of data collected with the LCTSR is minimal. Expert feedback from seven science faculty raised concerns regarding five question pairs that showed a high rate of inconsistent student responses (i.e., where students answered one item in a question pair correctly but the other item incorrectly). These concerns related to the multiple-choice options, the design and presentation of the items, and the complexity of some of the scenarios [5]. Evidence based on response process was evaluated through open-ended written explanations and interviews with students; the results further supported the expert concerns about the five question pairs with highly inconsistent response patterns [5]. Evidence based on relation to other variables has been gathered using both LCTSR scores and reasoning levels. Pre- to post-test score comparisons found that students’ scientific reasoning improved over a semester of instruction in an introductory biology course [1]. Additionally, LCTSR scores were found to be positively related to exam scores [1], ACS exam scores [4], and course grades (i.e., success in the course) [8]. Significant correlations have been found between students’ LCTSR scores and their normalized learning gains on concept inventories in physics and astronomy courses [3], as well as with a chemistry concept inventory and measures of intelligence and proportional reasoning ability [4]. When students were grouped into reasoning levels, it was found that students’ success on a transfer problem (i.e., an unrelated problem that assesses students’ reasoning) increased with higher reasoning levels [1]. Additionally, students at higher reasoning levels were found to perform better on both algorithmic and conceptual questions, as well as on an ACS exam [10].

    Evidence of single administration reliability was evaluated by calculating coefficient alpha for the entire instrument using the individual item scoring method (alpha = 0.85) and the pairwise scoring method (alpha = 0.86) [5]. Additionally, coefficient alpha was calculated for each question pair and found to range from 0.52 to 0.97 [5]. The difficulty of most question pairs was found to be in the suggested range of 0.3 - 0.9, although three question pairs fell outside of this range and were deemed too easy; these same three question pairs also showed poor discrimination (< 0.3) [5]. Point biserial correlation coefficients, used to investigate the relation between question pair scores and the total test score, were above the suggested value of 0.2 for all question pairs [5].
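
    For readers less familiar with these statistics, the sketch below shows how coefficient alpha, item (or question pair) difficulty, and point-biserial correlations are conventionally computed for dichotomously scored data. The exact computational choices made in [5] are not spelled out here, so this is a generic classical-test-theory illustration rather than a reproduction of that analysis; the array layout and function names are assumptions.

```python
import numpy as np

# X: (respondents x items) array of 0/1 scores, e.g. 12 pairwise-scored question pairs.

def cronbach_alpha(X):
    """Coefficient (Cronbach's) alpha: k/(k-1) * (1 - sum of item variances / total-score variance)."""
    X = np.asarray(X, dtype=float)
    k = X.shape[1]
    item_vars = X.var(axis=0, ddof=1)
    total_var = X.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

def difficulty(X):
    """Classical difficulty: proportion of respondents who answer each item correctly."""
    return np.asarray(X, dtype=float).mean(axis=0)

def point_biserial(X):
    """Correlation of each item score with the total test score."""
    X = np.asarray(X, dtype=float)
    total = X.sum(axis=1)
    return np.array([np.corrcoef(X[:, j], total)[0, 1] for j in range(X.shape[1])])
```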

    Recommendations for use   
    The LCTSR has been administered in multiple introductory courses to assess students’ scientific reasoning; however, the validity evidence for data collected with the LCTSR is limited. Evidence based on relation to other variables has been provided through relations between LCTSR scores and/or reasoning levels and other variables, such as scores on a transfer problem [1], exam scores [1,4,10], concept inventories [3,4], and course grade [8]. This supports the use of this measure to predict student achievement outcomes (i.e., participants who score higher on the LCTSR would be expected to score higher on content-based assessments).

    The predecessor to the LCTSR, the CTFR-78, was developed with the aid of experts in Piaget’s developmental theory and student interviews, providing some support for test content and response process; however, there is no indication that the same type of expert and/or student feedback was sought when modifying and creating additional items for the LCTSR [1]. A later use study obtained expert feedback and student interviews on the LCTSR question pairs and found multiple concerns with at least five question pairs [5]. Therefore, although there is evidence that the CTFR-78 measures formal reasoning, no equivalent evidence has been presented specifically to support the LCTSR as a measure of scientific reasoning. Concerns with some of the question pairs indicate that users should proceed with caution when interpreting results.

    Limited evidence based on internal structure of LCTSR data is provided. Although studies generally report an overall LCTSR score [1-10], with some also reporting subscale scores for the six reasoning patterns [3,5], no evidence is provided to support the use of these composite scores. There is also some concern about inconsistent student response patterns to the question pairs, which may increase uncertainty in participant scores and should be considered before interpreting data collected with the LCTSR [5].

    Given the limited support from validity evidence, future users are encouraged to provide evidence of test content, response process, and internal structure before interpreting results from data collected with the LCTSR.

    Details from panel review   
    The multiple-choice 2000 version of the LCTSR is based on modifications to an earlier 1978 version (CTFR-78). The CTFR-78 showed evidence based on test content and response process, as it was developed using Piaget’s developmental theory with the aid of experts in Piagetian research and interview data from middle-school students [5]. Additionally, quantitative data collected with the CTFR-78 were analyzed with principal components analysis to gain some information about the internal structure [5]. However, the validity evidence presented to support the modified multiple-choice LCTSR is more limited. Regarding evidence based on test content, information about the process behind the selection and modification of CTFR-78 items is not included in the development paper [1]. Additionally, although two new items are explicitly included in the development paper, there are no details about how these items were created [1]. A later study obtained expert feedback about the LCTSR items [5]; however, the feedback was used to note concerns about some question pairs and was not used for the development of the items. Similarly, evidence based on response process related to the LCTSR is minimal, with a later study noting concerns with some question pairs based on student interviews about the items [5]. Regarding evidence based on internal structure, one study evaluated the dependencies within and between question pairs by comparing the correlations of the item residuals (i.e., the absolute value of Q3 from a Rasch analysis) [5]. They found that most items correlated most strongly with their associated item in a question pair; however, there were six question pairs where this was not the case, indicating possible validity concerns for data collected with these items [5]. Student response patterns to the items within each question pair were also examined for consistency. Results indicated that although seven question pairs showed evidence of good consistency (i.e., students responded to both items in the same question pair either correctly or incorrectly), five of the question pairs showed high levels of inconsistency (i.e., students responded correctly to one item in a question pair and incorrectly to the other item in the same pair) [5]. The relations between LCTSR results and related outcomes and measures have been evaluated in multiple studies [1,3,4,8,10], providing evidence based on relation to other variables.
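
    The Q3 statistic referred to above is, in essence, the correlation between item residuals after fitting a Rasch model. A minimal sketch is given below, assuming that person abilities (theta) and item difficulties (b) have already been estimated with whatever Rasch software is used; the function and variable names are mine, not those of [5].

```python
import numpy as np

def q3_matrix(X, theta, b):
    """Yen's Q3: item-by-item correlations of residuals under a fitted Rasch model.

    X     : (persons x items) array of 0/1 responses
    theta : estimated person abilities, shape (persons,)
    b     : estimated item difficulties, shape (items,)
    """
    X = np.asarray(X, dtype=float)
    # Rasch expected score for each person-item combination: P = 1 / (1 + exp(-(theta - b)))
    expected = 1.0 / (1.0 + np.exp(-(theta[:, None] - b[None, :])))
    residuals = X - expected
    return np.corrcoef(residuals, rowvar=False)  # (items x items) matrix of Q3 values
```

    On this reading, the comparison in [5] amounts to checking whether the absolute Q3 value between the two items of a question pair exceeds the values each item shares with items from other pairs.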

    Single administration reliability of both the individual and pairwise scoring methods was evaluated with coefficient alpha, which initially gave 0.85 and 0.76, respectively [5]. As the individual and pairwise scoring methods result in different test lengths (i.e., 24 points vs. 12 points), the alpha value for the pairwise scoring method was adjusted using the Spearman-Brown prophecy formula, which gave an alpha value of 0.86 [5]. The sample-to-sample reliability of the measure was assessed in another study, which found that the mean and standard deviation of the results obtained with the LCTSR from one year to the next were very similar [4]. Additionally, the correlation of LCTSR scores from pretest to posttest was found to be 0.65, which could be considered evidence for test-retest reliability [1]. However, as there was instruction between the pretest and posttest, it is not clear that the results from both administrations are a good indicator of test-retest reliability, as students’ reasoning skills may be expected to change during this time.
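
    The Spearman-Brown adjustment mentioned above projects the reliability of the 12-pair scoring onto a test of doubled length. A quick check using only the values reported in [5] reproduces the adjusted coefficient; the function name is illustrative.

```python
def spearman_brown(r, k):
    """Spearman-Brown prophecy: reliability of a test lengthened by a factor of k."""
    return k * r / (1 + (k - 1) * r)

print(round(spearman_brown(0.76, 2), 2))  # 0.86, the adjusted alpha reported in [5]
```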

    References

    [1] Lawson, A.E., Clark, B., Cramer-Meldrum, E., Falconer, K.A., Sequist, J.M., & Kwon, Y.J. (2000). Development of Scientific Reasoning in College Biology: Do Two Levels of General Hypothesis-Testing Skills Exist? J. Res. Sci. Teach. 37, 81-101.

    [2] Jensen, J. L. & Lawson, A. (2011). Effects of collaborative group composition and inquiry instruction on reasoning gains and achievement in undergraduate biology. CBE LSE 10(1), 63-73.

    [3] Moore, J. C. & Rubbo, L. J. (2012). Scientific reasoning abilities of nonscience majors in physics-based courses. Phys. Rev. ST Phys. Educ. Res., 8, 010106.

    [4] Cracolice, M. S. & Busby, B. D. (2015). Preparation for College General Chemistry: More than Just a Matter of Content Knowledge Acquisition. J. Chem. Educ., 92(11), 1790-1797.

    [5] Bao, L., Xiao, Y., Koenig, K. & Han, J. (2018). Validity evaluation of the Lawson classroom test of scientific reasoning. Phys. Rev. Phys. Educ. Res., 14, 020106.

    [6] Jensen, J. L., Holt, E. A., Sowards, J. B., Heath Ogden, T. & West, R. E. (2018). Investigating Strategies for Pre-Class Content Learning in a Flipped Classroom. J. Sci. Educ. Tech., 27, 523-535.

    [7] Stammen, A. N., Malone, K. L. & Irving, K. E. (2018). Effects of Modeling Instruction Professional Development on Biology Teachers’ Scientific Reasoning Skills. Educ. Sciences 8(3), 119.

    [8] Thompson, E. D., Bowling, B. V. & Markle, R. E. (2018). Predicting Student Success in a Major’s Introductory Biology Course via Logistic Regression Analysis of Scientific Reasoning Ability and Mathematics Scores. Res. Sci. Educ. 48, 151-163.

    [9] Jensen, J. L., McDaniel, M. A., Kummer, T. A., Godoy, P. D. D. M. & St. Clair, B. (2020). Testing Effect on High-Level Cognitive Skills. CBE LSE 19:ar39.

    [10] Cracolice, M. S., Deming, J. C. & Ehlert, B. (2008). Concept Learning versus Problem Solving: A Cognitive Difference. J. Chem. Educ. 85(6), 873-878.

    Versions
    This instrument has not been modified nor was it created based on an existing instrument.
    Citations
    Listed below are all publications that develop, implement, modify, or reference the instrument.
    1. Lawson, A. E., Clark, B., Cramer-Meldrum, E., Falconer, K. A., Sequist, J. M., & Kwon, Y.-J. (2000). Development of Scientific Reasoning in College Biology: Do Two Levels of General Hypothesis-Testing Skills Exist? Journal of Research in Science Teaching, 37(1), 81-101.

    2. Cracolice, M.S., & Busby, B.D. (2015). Preparation for College General Chemistry: More than Just a Matter of Content Knowledge Acquisition. Journal of Chemical Education, 92(11), 1790-1797.

    3. Feldon, D. F., Timmerman, B. C., Stowe, K. A., & Showman, R. (2010). Translating expertise into effective instruction: The impacts of cognitive task analysis (CTA) on lab report quality and student retention in the biological sciences. Journal of Research in Science Teaching, 47(10), 1165-1185.

    4. Bao, L., Xiao, Y., Koenig, K., & Han, J. (2018). Validity evaluation of the Lawson classroom test of scientific reasoning. Physical Review Physics Education Research, 14(2), 020106.

    5. Jensen, J.L., Holt, E.A., Sowards, J.B., Heath Ogden, T., & West, R.E. (2018). Investigating Strategies for Pre-Class Content Learning in a Flipped Classroom. Journal of Science Education and Technology, 27(6), 523-535.

    6. Jensen, J.L., McDaniel, M.A., Kummer, T.A., Godoy, P.D.D.M., & St. Clair, B. (2020). Testing effect on high-level cognitive skills. CBE Life Sciences Education, 19(3), ar39.

    7. Stammen, A.N., Malone, K.L., & Irving, K.E. (2018). Effects of modeling instruction professional development on biology teachers’ scientific reasoning skills. Education Sciences, 8(3), 119.

    8. Jensen, J. L., & Lawson, A. (2011). Effects of collaborative group composition and inquiry instruction on reasoning gains and achievement in undergraduate biology. CBE Life Sciences Education, 10(1), 64-73.

    9. Moore, J.C., & Rubbo, L.J. (2012). Scientific reasoning abilities of nonscience majors in physics-based courses. Physical Review Special Topics - Physics Education Research, 8(1), 010106.

    10. Cracolice, M.S., Deming, J.C., & Ehlert, B. (2008). Concept learning versus problem solving: A cognitive difference. Journal of Chemical Education, 85(6), 873-878.

    11. Thompson, E.D., Bowling, B.V., & Markle, R.E. (2018). Predicting Student Success in a Major’s Introductory Biology Course via Logistic Regression Analysis of Scientific Reasoning Ability and Mathematics Scores. Research in Science Education, 48(1), 151-163.

    12. Barker, J. G. (2012). Effect of instructional methodologies on student achievement: Modeling instruction vs. traditional instruction. Master's Thesis.