Panel Review: Enthalpy And Entropy In Dissolution & Precipitation Inventory (E2DPI)

(Post last updated 09 June 2023)

Review panel summary   
The Enthalpy and Entropy in Dissolution & Precipitation Inventory (E2DPI) is a 28-item multiple-choice concept inventory (including 4 two-tiered answer-reason items) designed to test student understanding of the roles of enthalpy and entropy in the processes of dissolution and precipitation based on thermodynamic forces [1]. E2DPI items focus on examining student knowledge about enthalpy, entropy, and spontaneity either 1) within the context of dissolution (12 items) or precipitation (9 items) or 2) without any specific context (7 items). Within the existing literature, the E2DPI has been administered in a variety of courses after instruction and assessment on dissolution, precipitation, and thermodynamics [1].

The development of the E2DPI was guided by information gained from semi-structured cognitive interviews with general chemistry II (GCII, n=19), physical chemistry (PC, n=7), and biophysical chemistry (BPC, n=6) students after they had received instruction on the relevant course material. Analysis of student interview data allowed the authors of the E2DPI to create item distracters based on reported misconceptions, providing validity evidence based on test content. After initial item development, the ‘pilot’ version of the E2DPI was administered in the Spring of 2018 at a medium-sized midwestern US institution to students in a PC (n=10) course and a BPC (n=43) course. Evidence based on response process was gathered through 7 cognitive interviews with PC and BPC students 2-3 weeks after the initial administration of the inventory. The purpose of these interviews was to determine whether students were selecting the correct answer for the correct reason. Where this was not the case, the authors revised the affected E2DPI items and/or distracters to address the issues they observed during these interviews. The final revised version of the concept inventory was then administered at the end of the Spring 2018 semester to a GCII (n=383) course at the same institution. An additional 9 cognitive interviews were conducted with GCII students after this administration, and no further revisions to the instrument were deemed necessary. In the following academic year, the final version of the E2DPI was administered at a different large, research-intensive institution in a “post-organic general chemistry II” (PO-GCII, n=160) course.

The E2DPI authors reported evidence based on concurrent validity, which is an aspect of validity evidence based on relations to other variables. This evidence was provided by comparing the E2DPI scores of GCII students (n=383) to those of PC (n=10) and BPC students (n=43) (combined n=53). The authors of the E2DPI suggested that theoretically, students who have received more instruction should be able to perform better on the E2DPI than those who have received less instruction on the related topics. To test this theory, a two-tailed independent samples Mann-Whitney U test was conducted and showed a significant difference with a small effect size favoring the students in the upper-division chemistry courses (i.e., PC and BPC students).

The authors explored single administration reliability evidence by reporting coefficient alpha values for each of the course populations who completed the inventory (i.e., GCII, PO-GCII, PC/BPC). All reported coefficient alpha values were found to be above the commonly reported threshold of 0.7. That said, the authors suggested that these values should be “interpreted with caution” because “[t]he construction of items using student thinking as distracters can pose a threat to the underlying assumptions of these tests” [1].
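As a point of reference, coefficient alpha can be computed from a students-by-items score matrix as sketched below. This is a minimal illustration of the standard formula, not the authors' analysis code; the function name and the response data are hypothetical. For items scored 0/1, as on a multiple-choice inventory, the same calculation is also known as KR-20.

```python
from statistics import pvariance

def coefficient_alpha(responses):
    """Coefficient alpha for a list of per-student item-score lists.

    alpha = k/(k-1) * (1 - sum(item variances) / variance(total scores))
    """
    k = len(responses[0])  # number of items
    item_vars = [pvariance([row[i] for row in responses]) for i in range(k)]
    total_var = pvariance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical data: 5 students x 4 items, scored 1 (correct) or 0 (incorrect).
example_responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
alpha = coefficient_alpha(example_responses)
```

Values above 0.7 would meet the threshold mentioned above, subject to the same caution the authors raise about distracter-based items.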

In addition to evidence of validity and reliability, the authors also explored item difficulty and item discrimination. Item difficulty was first explored by testing the score distributions for normality using the Kolmogorov-Smirnov test. While the PO-GCII and PC/BPC data were normally distributed, the GCII data were right-skewed, indicating the inventory was difficult for most of these students. Item difficulty was also calculated as the proportion of students who answered an item correctly, so higher difficulty values indicate easier items. Acceptable values for item difficulty typically range between 0.3 and 0.8. The authors of the E2DPI reported a wide range of item difficulties across all samples of students. Few items were easy for the PO-GCII students (2 items) or the PC/BPC students (3 items), and none of the items were easy for the GCII students. The PO-GCII and PC/BPC students found a limited number of items to be difficult (2 and 1 items, respectively), while GCII students found more items to be difficult (6 items). Item discrimination was determined for each item by calculating the difference between the percentage of the top 33% of students who answered the item correctly and the percentage of the bottom 33% of students who did so. Item discrimination values greater than 0.3 are considered highly discriminating. For GCII, PO-GCII, and PC/BPC, several items showed discrimination values below 0.3, but only four items (6, 15, 16, 18) showed poor discrimination across all three courses. Ferguson’s delta was also calculated, and acceptable values (above 0.9) were found for all three courses, indicating that student scores were spread broadly across the range of possible scores (although the authors urged the same caution in interpreting Ferguson’s delta that they did for coefficient alpha).
Finally, point-biserial correlations between each item's scores and the overall scores on the instrument were calculated. The correlations were all equal to or greater than 0.2, which is considered to be acceptable.
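The three item-level statistics described in this section (difficulty, discrimination, and point-biserial correlation) can be sketched as follows. This is an illustration under stated assumptions rather than the authors' procedure: the function names and data are hypothetical, the top and bottom groups are each taken as roughly one third of the students ranked by total score, ties at the group boundary are broken by input order, and the point-biserial correlation uses the uncorrected total score (the item itself is included in the total).

```python
import math

def item_difficulty(responses, item):
    """Proportion of students answering the item correctly (higher = easier)."""
    return sum(row[item] for row in responses) / len(responses)

def item_discrimination(responses, item, fraction=1 / 3):
    """Proportion correct in the top group minus proportion correct in the bottom group."""
    ranked = sorted(responses, key=sum, reverse=True)  # highest total score first
    size = max(1, round(len(ranked) * fraction))
    top, bottom = ranked[:size], ranked[-size:]
    return sum(r[item] for r in top) / size - sum(r[item] for r in bottom) / size

def point_biserial(responses, item):
    """Pearson correlation between a 0/1 item score and the total score."""
    x = [row[item] for row in responses]
    t = [sum(row) for row in responses]
    mx, mt = sum(x) / len(x), sum(t) / len(t)
    cov = sum((a - mx) * (b - mt) for a, b in zip(x, t))
    return cov / math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - mt) ** 2 for b in t))

# Hypothetical data: 6 students x 3 items, scored 1 (correct) or 0 (incorrect).
example_responses = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 1],
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
]
difficulty_0 = item_difficulty(example_responses, 0)
discrimination_0 = item_discrimination(example_responses, 0)
r_pb_0 = point_biserial(example_responses, 0)
```

Against the thresholds cited above, an item would be flagged if its difficulty fell outside 0.3 to 0.8, its discrimination fell below 0.3, or its point-biserial correlation fell below 0.2.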

Recommendations for use   
Validity and reliability evidence collected by the developers of the E2DPI supports use of the concept inventory in general chemistry II, post-organic general chemistry II, and physical/biophysical chemistry courses after instruction on enthalpy and entropy to gauge student understanding of the processes of dissolution and precipitation based on thermodynamic forces. Additionally, E2DPI data have been used to demonstrate the theoretical idea that students with more instruction on this topic (e.g., PC/BPC students) will likely perform better than students who have not had much instruction on this topic (e.g., GCII students).

Item difficulty and item discrimination have been explored with students across course levels, and indicate that the E2DPI items produce a wide range of responses. While future users of the E2DPI should be aware that several items have shown poor discrimination within or across course levels, the developers of the concept inventory argue that these items are useful for users of the E2DPI “because higher performing students hold misconceptions that are detected by those items.”

Finally, E2DPI sub-scores have been calculated and compared for a variety of author-derived item groupings (by topic), which may be of interest to future E2DPI users. As these are not statistically derived groupings, and the rationale for placing each item into its respective grouping is not discussed in the literature, it is recommended that interested users review the items within each grouping to determine if they will be useful in understanding the data from their students.

Details from panel review   
Validity evidence based on relations to other variables has been provided for the data collected with the E2DPI by comparing the scores between students with little instruction (i.e., GCII students) on the topics of focus and students with deeper-level instruction (i.e., PC/BPC students). Students with more instruction should, in theory, perform better than students with less instruction on this concept inventory. The statistical analysis the authors used to compare the scores between the students in these courses was a two-tailed independent samples Mann-Whitney U test. The authors report U = 5158.5, p < 0.001, η² = 0.078. These results suggest that there is a statistically significant difference between the two student groups with a small effect size.
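For readers who want to run a comparable test on their own data, the sketch below computes the Mann-Whitney U statistic, a two-tailed p-value from the normal approximation (with tie correction, without continuity correction), and an effect size. The effect-size convention η² = z²/N is an assumption here; the review does not state which formula the authors used. Function names and data are hypothetical.

```python
import math
from collections import Counter

def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def mann_whitney_u(group_a, group_b):
    """Return (U, z, two-tailed p, eta squared) using the normal approximation."""
    n1, n2 = len(group_a), len(group_b)
    n = n1 + n2
    pooled = list(group_a) + list(group_b)
    ranks = average_ranks(pooled)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)
    # Tie-corrected standard deviation of U under the null hypothesis.
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values())
    sigma = math.sqrt(n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1))))
    z = (u - n1 * n2 / 2) / sigma
    p = math.erfc(abs(z) / math.sqrt(2))  # two-tailed
    eta_sq = z * z / n  # assumed convention: eta^2 = z^2 / N
    return u, z, p, eta_sq

# Hypothetical total scores for a lower- and an upper-division group.
lower_division = [1, 2, 3, 4, 5]
upper_division = [6, 7, 8, 9, 10]
u, z, p, eta_sq = mann_whitney_u(lower_division, upper_division)
```

The normal approximation is reasonable for samples as large as those in the study (n=383 and n=53); for very small samples an exact test would be preferable.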

The authors presented evidence of single administration reliability by reporting coefficient alpha values for all three courses. The values were all above the commonly accepted threshold of 0.7 (GCII α=0.73; PC/BPC α=0.79; and PO-GCII α=0.72). Additionally, Ferguson's delta was used to assess the discriminatory power of the instrument as a whole. The values were all above the 0.9 threshold (GCII δ=0.96; PO-GCII δ=0.98; PC/BPC δ=0.93). In each of these cases (coefficient alpha and Ferguson's delta), the authors suggested that these values should be “interpreted with caution” because “[t]he construction of items using student thinking as distracters can pose a threat to the underlying assumptions of these tests” [1].
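Ferguson's delta compares the observed spread of total scores with the maximum possible spread, which occurs when every possible score is earned equally often. A minimal sketch of the standard formula follows; the function name and data are hypothetical, and this is not the authors' code.

```python
from collections import Counter

def fergusons_delta(total_scores, n_items):
    """Ferguson's delta: (k + 1)(N^2 - sum f_i^2) / (k N^2).

    N is the number of students, k the number of items, and f_i the number
    of students earning each distinct total score.
    """
    n = len(total_scores)
    sum_sq = sum(f * f for f in Counter(total_scores).values())
    return (n_items + 1) * (n * n - sum_sq) / (n_items * n * n)

# Hypothetical 4-item test: every possible score (0 to 4) occurs once,
# so delta reaches its maximum of 1.
delta_uniform = fergusons_delta([0, 1, 2, 3, 4], n_items=4)
# If every student earns the same score, delta is 0.
delta_identical = fergusons_delta([2, 2, 2, 2, 2], n_items=4)
```

For the E2DPI, `n_items` would be 28, and values above 0.9 would meet the threshold cited above.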

Finally, normality of distribution for E2DPI scores was assessed with the Kolmogorov-Smirnov test. The scores for GCII students were not normally distributed but right-skewed, indicating the items were likely difficult for these students (K-S = 0.122, df = 383, p<0.001). However, the scores for PO-GCII (K-S = 0.069, df = 160, p<0.124) and PC/BPC (K-S = 0.060, df = 53, p<0.200) were normally distributed.
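The K-S values above are distances between the observed cumulative distribution of scores and a normal curve. The sketch below computes that distance only, assuming the normal curve uses the sample mean and standard deviation (an assumption; the review does not say how the reference distribution was specified). It does not compute a p-value, since with estimated parameters that requires a corrected reference distribution. The function name and data are hypothetical.

```python
import math
from statistics import mean, stdev

def ks_statistic_normal(scores):
    """Largest gap between the empirical CDF and a fitted normal CDF."""
    n = len(scores)
    mu, sigma = mean(scores), stdev(scores)
    d = 0.0
    for i, x in enumerate(sorted(scores), start=1):
        cdf = 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))
        # Check the gap just after and just before the step at x.
        d = max(d, i / n - cdf, cdf - (i - 1) / n)
    return d

# Hypothetical total scores: one roughly symmetric set, one right-skewed set.
symmetric_scores = [8, 10, 11, 12, 12, 13, 14, 16]
skewed_scores = [1, 1, 1, 2, 2, 3, 9, 20]
d_symmetric = ks_statistic_normal(symmetric_scores)
d_skewed = ks_statistic_normal(skewed_scores)
```

A larger statistic indicates a greater departure from normality, as with the GCII sample relative to the other two courses.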

References

[1] Abell, T.N. & Bretz, S.L. (2019). Development of the Enthalpy and Entropy in Dissolution and Precipitation Inventory. J. Chem. Educ. 96(9), 1804–1812.