Introduction to jMetrik
jMetrik is a free program written by J. Patrick Meyer of the Curry School of Education at the University of Virginia.
jMetrik performs analyses required for construction and utilization of psychological tests – both “right/wrong answer” tests and “Likert” scales.
Getting jMetrik
https://itemanalysis.com/
The version described here is 4.0.5.
The basic steps involved are . . .
1. Getting the data into jMetrik. This is a process that varies from program to program, as our experiences with R have shown.
2. Scoring the test. Telling jMetrik which digits/letters are the correct answers in a right/wrong test or the order of values in a Likert scale. This process is also one that varies from one program to the next. Bond & Fox Steps used syntax preceding the data in the data file for these specifications. jMetrik does it with pull-down menus.
3. Analyzing the test – items, persons, distributions, scores, etc.
Most of the examples shown here are those that we’ve already analyzed using the Bond & Fox Steps program. This will facilitate comparison of jMetrik with the BF program.
Following is an example of the Bond & Fox Chapter 5 BLOT data.
In these data, 1s represent correct responses, 0s represent incorrect responses.
A later example will illustrate how to treat raw multiple-choice data consisting of “a”s, “b”s, “c”s and “d”s, for which we’ll have to tell jMetrik which response is correct.
1. Entering data into jMetrik – the Bond & Fox Chapter 5 data
1A. Create a database.
A database is something like a specialized folder on the hard drive. Data files, output documents, etc., are stored there.
Databases are automatically stored in the User folder on the main, C, hard drive on Windows machines. I used Edit -> Preferences to see where it is on my computer . . .
1B. Open the database using Manage -> Open Database.
Highlight the database name, then click Open.
1C. Import data – Manage -> Import Data
Data files can be “.txt” files or “.csv” files with comma, tab, colon, or semicolon delimiters.
The first line of the file can contain variable names.
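To make the file layout concrete, here is a minimal Python sketch that writes a small comma-delimited file with the variable names on the first line. The item names and the three rows of responses are made up for illustration, and the helper name write_item_csv is hypothetical; it is not part of jMetrik.

```python
import csv
import io


def write_item_csv(rows, item_names, stream):
    """Write a header row of variable names, then one row per examinee."""
    writer = csv.writer(stream, lineterminator="\n")
    writer.writerow(item_names)
    writer.writerows(rows)


# Hypothetical responses: three examinees by three items (1 = correct, 0 = incorrect).
item_names = ["i1", "i2", "i3"]
rows = [[1, 1, 0], [1, 0, 0], [1, 1, 1]]

# An in-memory buffer stands in for a real .csv file on disk.
buffer = io.StringIO()
write_item_csv(rows, item_names, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

Replacing the buffer with an opened file would give a file of the kind described above: names on line 1, comma-delimited responses below.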
The appearance of the data in the Excel csv file that I’d created earlier.
The data after importing into jMetrik. Click on the name of the Table to get this display.
The items in this data set are those from Bond & Fox Chapter 5 – the BLOT test example.
2. Scoring the test – Transform -> Basic Item Scoring
Test scoring involves telling the program which digits/letters represent correct responses for right/wrong answers or what the order of responses is for Likert scales. We did not do this when studying the Bond & Fox Steps program. The scoring was in the syntax preceding the data.
For this example, the 1s represent correct responses and the 0s represent incorrect responses.
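The idea behind scoring can be sketched in a few lines of Python. This only illustrates the logic of comparing each response with a key; it is not how jMetrik does it internally. The function names and the multiple-choice responses and key are hypothetical.

```python
def score_response(response, key):
    """Return 1 if the response matches the keyed (correct) answer, else 0."""
    return 1 if response == key else 0


def score_examinee(responses, keys):
    """Score one examinee's responses against the answer key, item by item."""
    return [score_response(r, k) for r, k in zip(responses, keys)]


# Already-dichotomous data like the BLOT file: the key for every item is 1.
blot_like = score_examinee([1, 0, 1], [1, 1, 1])

# Hypothetical multiple-choice responses with a hypothetical answer key.
multiple_choice = score_examinee(["a", "c", "d"], ["a", "b", "d"])

print(blot_like)        # [1, 0, 1]
print(multiple_choice)  # [1, 0, 1]
```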
Result of the item scoring . . . (obtained by clicking on the Variables tab -> Refresh Data View)
Note – where did the 0s come from? Are they assumed or did the program peruse the data and discover that for all items, the only other value was a 1?
3. Analyses
3A. A “non-Rasch” item analysis: Analyze -> Item Analysis
When you’re done, click on [Run].
The Item-Analysis Output if “All Response Options” in the above is checked.
ITEM ANALYSIS
p5520database1.BFCHAPTER5DATA1
April 6, 2016 09:31:07
======
Item Option (Score) Difficulty Std. Dev. Discrimin.
------
i1 Overall 0.8667 0.3411 0.3795
1.0(1.0) 0.8667 0.3411 0.3795
2.0(0.0) 0.0000 0.0000 NaN
i2 Overall 0.8600 0.3481 0.3547
1.0(1.0) 0.8600 0.3481 0.3547
2.0(0.0) 0.0000 0.0000 NaN
i3 Overall 0.6533 0.4775 0.4362
1.0(1.0) 0.6533 0.4775 0.4362
2.0(0.0) 0.0000 0.0000 NaN
i4 Overall 0.7733 0.4201 0.3979
1.0(1.0) 0.7733 0.4201 0.3979
2.0(0.0) 0.0000 0.0000 NaN
i5 Overall 0.8867 0.3181 0.3488
1.0(1.0) 0.8867 0.3181 0.3488
2.0(0.0) 0.0000 0.0000 NaN
i6 Overall 0.9667 0.1801 0.2065
1.0(1.0) 0.9667 0.1801 0.2065
2.0(0.0) 0.0000 0.0000 NaN
TEST LEVEL STATISTICS
======
Number of Items = 35
Number of Examinees = 150
Min = 5.0000
Max = 35.0000
Mean = 26.3333
Median = 27.0000
Standard Deviation = 6.3031
Interquartile Range = 8.0000
Skewness = -0.8874
Kurtosis = 0.5106
KR21 = 0.8605
======
Since the above takes up a lot of space to present the basic information,
I reran the analysis, NOT checking the All Response Options box.
The result is below.
The output of the “non-Rasch” item analysis of the BLOT data . . .
([All Response Options] not checked)
ITEM ANALYSIS
p5950cdatabase1.JMETRIKEXAMPLE1
March 28, 2015 13:22:35
======
Item Option (Score) Difficulty Std. Dev. Discrimin.
------
i1 Overall 0.8667 0.3411 0.3795
i2 Overall 0.8600 0.3481 0.3547
i3 Overall 0.6533 0.4775 0.4362
i4 Overall 0.7733 0.4201 0.3979
i5 Overall 0.8867 0.3181 0.3488
i6 Overall 0.9667 0.1801 0.2065
i7 Overall 0.8533 0.3550 0.3998
i8 Overall 0.6333 0.4835 0.5093
io9 Overall 0.7467 0.4364 0.3478
o10 Overall 0.8000 0.4013 0.4667
i11 Overall 0.7467 0.4364 0.3863
i12 Overall 0.9400 0.2383 0.5577
i13 Overall 0.6067 0.4901 0.2975
i14 Overall 0.8600 0.3481 0.2129
i15 Overall 0.6067 0.4901 0.4565
i16 Overall 0.8133 0.3909 0.2726
i17 Overall 0.7133 0.4537 0.5302
i18 Overall 0.7800 0.4156 0.4868
i19 Overall 0.7000 0.4598 0.4064
i20 Overall 0.8733 0.3337 0.4291
i21 Overall 0.3600 0.4816 0.1757
i22 Overall 0.8933 0.3097 0.3847
i23 Overall 0.7200 0.4505 0.3630
i24 Overall 0.7400 0.4401 0.4984
i25 Overall 0.6933 0.4627 0.3406
i26 Overall 0.6467 0.4796 0.5012
i27 Overall 0.8800 0.3261 0.4680
i28 Overall 0.4867 0.5015 0.3384
i29 Overall 0.8333 0.3739 0.4302
i30 Overall 0.5933 0.4929 0.2694
i31 Overall 0.7467 0.4364 0.3402
i32 Overall 0.5800 0.4952 0.4740
i33 Overall 0.8400 0.3678 0.2905
i34 Overall 0.8267 0.3798 0.3873
i35 Overall 0.8133 0.3909 0.4519
======
For example . . .
The easiest items are i6 and i12. The hardest item is i21.
The most discriminating items are i12 and i17.
The least discriminating item is i21.
Item discrimination is a characteristic we have not seen in the Rasch analyses conducted by Bond & Fox Steps. (There is a column in the Item Measure output called PTmea Corr which may represent this.) The summary output in SPSS’s RELIABILITY procedure gives the item-total correlation.
The author suggests that discrimination values should be between .3 and .7. Note that Discrimination is largest for middle-difficulty items.
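The classical statistics in the item table can be sketched in Python as follows. Difficulty is the proportion correct, and the reported Std. Dev. for i1 is reproduced by the sample standard deviation (n - 1 divisor) of a 0/1 variable. Discrimination is computed here as an item-rest correlation. That is an assumption, since the exact correlation jMetrik reports depends on the options chosen in its dialog. The toy data and all function names are hypothetical.

```python
import math


def difficulty(item_scores):
    """Classical item difficulty: the proportion of examinees scoring 1."""
    return sum(item_scores) / len(item_scores)


def sd_from_p(p, n):
    """Sample standard deviation (n - 1 divisor) of a 0/1 item with mean p."""
    return math.sqrt(p * (1 - p) * n / (n - 1))


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def item_rest_correlation(data, item_index):
    """Correlate one item with the total score on the remaining items."""
    item = [row[item_index] for row in data]
    rest = [sum(row) - row[item_index] for row in data]
    return pearson(item, rest)


# Check against the output: i1 has Difficulty 0.8667 with 150 examinees,
# which corresponds to 130 correct responses.
i1_sd = sd_from_p(130 / 150, 150)
print(round(i1_sd, 4))  # 0.3411, the Std. Dev. reported for i1

# Hypothetical toy data: 5 examinees (rows) by 3 items (columns).
toy = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 1, 0], [0, 0, 0]]
toy_p = difficulty([row[0] for row in toy])
toy_r = item_rest_correlation(toy, 0)
print(toy_p, round(toy_r, 3))
```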
Summary Statistics for the collection of items
TEST LEVEL STATISTICS
======
Number of Items = 35
Number of Examinees = 150
Min = 5.0000
Max = 35.0000
Mean = 26.3333
Median = 27.0000
Standard Deviation = 6.3031
Interquartile Range = 8.0000
Skewness = -0.8874
Kurtosis = 0.5106
KR21 = 0.8605
======
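The KR21 value can be recovered from the other numbers in this table. A minimal sketch, using the standard Kuder-Richardson formula 21 and assuming the reported Standard Deviation is the one that enters the formula:

```python
def kr21(n_items, mean, sd):
    """Kuder-Richardson formula 21 from the item count, mean, and SD of total scores."""
    variance = sd ** 2
    return (n_items / (n_items - 1)) * (1 - mean * (n_items - mean) / (n_items * variance))


# Values copied from the TEST LEVEL STATISTICS table above.
kr21_estimate = kr21(n_items=35, mean=26.3333, sd=6.3031)
print(round(kr21_estimate, 3))
```

The result agrees with the reported 0.8605 to within rounding of the inputs.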
RELIABILIY ANALYSIS
======
Method Estimate 95% Conf. Int. SEM
------
Guttman's L2 0.8804 (0.8512, 0.9063) 2.1799
Coefficient Alpha 0.8758 (0.8455, 0.9027) 2.2211
Feldt-Gilmer 0.8783 (0.8485, 0.9046) 2.1993
Feldt-Brennan 0.8784 (0.8487, 0.9047) 2.1984
Raju's Beta 0.8758 (0.8455, 0.9027) 2.2211
======
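The SEM column follows from the classical relationship SEM = SD x sqrt(1 - reliability), using the total-score standard deviation from the test-level statistics. A short sketch (the function name is my own):

```python
import math


def standard_error_of_measurement(sd, reliability):
    """Classical SEM: SD of total scores times sqrt(1 - reliability)."""
    return sd * math.sqrt(1 - reliability)


total_score_sd = 6.3031  # Standard Deviation from TEST LEVEL STATISTICS

sem_l2 = standard_error_of_measurement(total_score_sd, 0.8804)
sem_alpha = standard_error_of_measurement(total_score_sd, 0.8758)
print(round(sem_l2, 2), round(sem_alpha, 2))  # 2.18 2.22
```

Small differences in the later decimals from the table's 2.1799 and 2.2211 come from using reliabilities rounded to four places.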
Reliabilities if items deleted . . .
RELIABILIY IF ITEM DELTED
======
Item L2 Alpha F-G F-B Raju
------------------
i1 0.8775 0.8728 0.8754 0.8754 0.8728
i2 0.8779 0.8732 0.8757 0.8758 0.8732
i3 0.8762 0.8715 0.8740 0.8741 0.8715
i4 0.8771 0.8723 0.8749 0.8750 0.8723
i5 0.8780 0.8734 0.8758 0.8759 0.8734
i6 0.8796 0.8755 0.8775 0.8776 0.8755
i7 0.8771 0.8724 0.8749 0.8750 0.8724
i8 0.8743 0.8697 0.8720 0.8722 0.8697
io9 0.8781 0.8734 0.8759 0.8760 0.8734
o10 0.8758 0.8709 0.8735 0.8736 0.8709
i11 0.8774 0.8726 0.8752 0.8753 0.8726
i12 0.8761 0.8713 0.8738 0.8739 0.8713
i13 0.8796 0.8749 0.8775 0.8776 0.8749
i14 0.8801 0.8757 0.8779 0.8780 0.8757
i15 0.8758 0.8710 0.8735 0.8736 0.8710
i16 0.8794 0.8748 0.8773 0.8774 0.8748
i17 0.8740 0.8693 0.8717 0.8718 0.8693
i18 0.8753 0.8705 0.8730 0.8731 0.8705
i19 0.8769 0.8721 0.8747 0.8748 0.8721
i20 0.8768 0.8720 0.8746 0.8746 0.8720
i21 0.8821 0.8777 0.8801 0.8802 0.8777
i22 0.8775 0.8729 0.8754 0.8754 0.8729
i23 0.8778 0.8731 0.8757 0.8758 0.8731
i24 0.8748 0.8701 0.8726 0.8727 0.8701
i25 0.8784 0.8737 0.8762 0.8763 0.8737
i26 0.8746 0.8699 0.8723 0.8724 0.8699
i27 0.8762 0.8714 0.8740 0.8741 0.8714
i28 0.8786 0.8739 0.8765 0.8765 0.8739
i29 0.8765 0.8718 0.8743 0.8744 0.8718
i30 0.8803 0.8756 0.8782 0.8783 0.8756
i31 0.8783 0.8736 0.8761 0.8762 0.8736
i32 0.8752 0.8705 0.8730 0.8731 0.8705
i33 0.8790 0.8744 0.8768 0.8769 0.8744
i34 0.8771 0.8726 0.8750 0.8752 0.8726
i35 0.8759 0.8713 0.8738 0.8739 0.8713
------
L2: Guttman's lambda-2 Alpha: Coefficient alpha F-G: Feldt-Gilmer coefficient
F-B: Feldt-Brennan coefficient Raju: Raju's beta coefficient
3B. A Rasch item analysis: Analyze -> Rasch Models (JMLE)
Note – Rasch analysis requires that the items be scored 0 or 1 for dichotomous items. Scoring them 1 or 2 will not work.
Ours happen to be scored that way, so we don’t have to transform them.
Below are the 3 dialog boxes that you need to interact with when performing Rasch analyses.
The only options I’ve chosen here that are not default are the two check boxes on the Person dialog.
When you’re done, click on the [Run] button.
The Rasch model output for items
RASCH ANALYSIS
p5520database1.BFCHAPTER5DATA1
April 6, 2016 09:47:51
FINAL JMLE ITEM STATISTICS
======
Item Difficulty Std. Error WMS Std. WMS UMS Std. UMS
------
i1 -0.79 0.26 0.98 -0.04 0.69 -0.79
i2 -0.72 0.26 1.01 0.12 0.75 -0.61
i3 0.76 0.20 0.98 -0.23 0.90 -0.55
i4 -0.00 0.22 1.00 0.03 0.88 -0.43
i5 -1.01 0.28 0.98 -0.06 0.76 -0.47
i6 -2.49 0.47 1.06 0.27 0.83 0.12
i7 -0.66 0.25 0.97 -0.11 0.65 -1.00
i8 0.88 0.19 0.91 -1.09 1.00 0.07
io9 0.18 0.21 1.07 0.66 0.97 -0.05
o10 -0.20 0.23 0.92 -0.66 0.68 -1.18
i11 0.18 0.21 1.02 0.25 0.96 -0.09
i12 -1.81 0.36 0.69 -1.14 0.24 -1.51
i13 1.03 0.19 1.16 1.99 1.32 2.02
i14 -0.72 0.26 1.15 0.96 1.32 0.90
i15 1.03 0.19 0.97 -0.41 0.84 -1.09
i16 -0.31 0.23 1.13 1.00 1.03 0.20
i17 0.40 0.20 0.87 -1.45 0.75 -1.33
i18 -0.05 0.22 0.90 -0.87 0.74 -1.04
i19 0.48 0.20 1.01 0.15 1.05 0.31
i20 -0.86 0.27 0.91 -0.47 0.81 -0.36
i21 2.40 0.20 1.27 2.64 1.75 3.73
i22 -1.09 0.29 0.91 -0.41 1.69 1.41
i23 0.36 0.21 1.06 0.66 0.92 -0.31
i24 0.23 0.21 0.89 -1.09 1.03 0.21
i25 0.52 0.20 1.07 0.81 1.26 1.34
i26 0.80 0.20 0.90 -1.28 0.75 -1.60
i27 -0.94 0.27 0.85 -0.84 0.62 -0.92
i28 1.68 0.19 1.12 1.42 1.23 1.70
i29 -0.47 0.24 0.94 -0.41 0.71 -0.88
i30 1.10 0.19 1.19 2.27 1.15 1.04
i31 0.18 0.21 1.07 0.70 1.55 2.15
i32 1.17 0.19 0.96 -0.51 0.85 -1.11
i33 -0.53 0.25 1.10 0.68 0.93 -0.09
i34 -0.42 0.24 1.00 0.05 0.79 -0.62
i35 -0.31 0.23 0.93 -0.54 0.73 -0.90
======
In the Rasch output, “Difficulty” is the item difficulty estimate: larger positive values indicate more difficult items.
WMS is the Infit measure displayed by Bond & Fox Steps.
UMS is the B&F Outfit measure.
Std. values are “Z” values for the WMS and UMS measures.
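To see what the difficulty values mean, here is a small sketch of the dichotomous Rasch model, in which the probability of a correct response depends only on the difference between person ability (theta) and item difficulty. The choice of theta = 0 is arbitrary and only for illustration.

```python
import math


def rasch_probability(theta, difficulty):
    """Probability of a correct response under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


# Difficulties copied from the item table above; theta = 0 is an illustrative ability.
p_i6 = rasch_probability(0.0, -2.49)   # i6, the easiest item
p_i21 = rasch_probability(0.0, 2.40)   # i21, the hardest item
print(round(p_i6, 3), round(p_i21, 3))  # 0.923 0.083
```

A person at theta = 0 is very likely to pass i6 and very unlikely to pass i21, which is what "more difficult" means on this scale.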
Comparison with item measures from the Bond & Fox program analysis of same data.
As the scatterplot shows, these estimates are virtually identical to the item measures from our analysis of the same data using Bond & Fox Steps. There is a God of Statistics watching over us all – jMetrik & BF.
(I should note, however, that in a trial run, I got results in which the item 29 measure from jMetrik was 8.54. That, clearly, was out of line.)
jMetrik’s Score Table – Mile-marker total scores and Theta (Person ability) values.
I don’t recall this table in the B&F Steps output.
SCORE TABLE
======
Score Theta Std. Err
------
0.00 -5.22 1.85
1.00 -3.96 1.04
2.00 -3.20 0.75
3.00 -2.73 0.63
4.00 -2.37 0.56
5.00 -2.09 0.51
6.00 -1.84 0.48
7.00 -1.63 0.45
8.00 -1.43 0.43
9.00 -1.25 0.42
10.00 -1.08 0.40
11.00 -0.92 0.39
12.00 -0.77 0.39
13.00 -0.62 0.38
14.00 -0.48 0.38
15.00 -0.34 0.37
16.00 -0.20 0.37
17.00 -0.07 0.37
18.00 0.07 0.37
19.00 0.21 0.37
20.00 0.34 0.37
21.00 0.48 0.38
22.00 0.63 0.38
23.00 0.77 0.39
24.00 0.93 0.39
25.00 1.08 0.40
26.00 1.25 0.42
27.00 1.43 0.43
28.00 1.63 0.45
29.00 1.84 0.48
30.00 2.09 0.51
31.00 2.37 0.56
32.00 2.72 0.63
33.00 3.19 0.75
34.00 3.95 1.03
35.00 5.21 1.84
======
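A sketch of where a score table like this can come from: for a non-extreme raw score, the theta is the value at which the expected score (the sum of the Rasch probabilities over all items) equals that raw score, and the standard error is 1 / sqrt(test information). This is the usual maximum likelihood logic. Treating it as exactly what jMetrik does, particularly for the extreme scores 0 and 35, is an assumption. The function names are hypothetical.

```python
import math

# Item difficulties copied from the FINAL JMLE ITEM STATISTICS table (i1 ... i35).
DIFFICULTIES = [
    -0.79, -0.72, 0.76, -0.00, -1.01, -2.49, -0.66, 0.88, 0.18, -0.20,
    0.18, -1.81, 1.03, -0.72, 1.03, -0.31, 0.40, -0.05, 0.48, -0.86,
    2.40, -1.09, 0.36, 0.23, 0.52, 0.80, -0.94, 1.68, -0.47, 1.10,
    0.18, 1.17, -0.53, -0.42, -0.31,
]


def probabilities(theta):
    """Rasch probability of a correct response on each item at this theta."""
    return [1.0 / (1.0 + math.exp(-(theta - b))) for b in DIFFICULTIES]


def expected_score(theta):
    """Expected raw score at theta: the sum of the item probabilities."""
    return sum(probabilities(theta))


def standard_error(theta):
    """1 / sqrt(test information), where information is the sum of p(1 - p)."""
    information = sum(p * (1 - p) for p in probabilities(theta))
    return 1.0 / math.sqrt(information)


def theta_for_score(raw_score, low=-8.0, high=8.0, tolerance=1e-6):
    """Find theta whose expected score equals raw_score (non-extreme scores only)."""
    while high - low > tolerance:
        mid = (low + high) / 2
        if expected_score(mid) < raw_score:
            low = mid   # expected score too small, so theta must be larger
        else:
            high = mid
    return (low + high) / 2


theta_18 = theta_for_score(18)
se_18 = standard_error(theta_18)
print(round(theta_18, 2), round(se_18, 2))
```

For a raw score of 18 the result should land close to the table row above (Theta 0.07, Std. Err 0.37).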
Summary Statistics from the analysis.
SCALE QUALITY STATISTICS
======
Statistic Items Persons
------
Observed Variance 0.9312 1.6900
Observed Std. Dev. 0.9650 1.3000
Mean Square Error 0.0585 0.3136
Root MSE 0.2419 0.5600
Adjusted Variance 0.8727 1.3764
Adjusted Std. Dev. 0.9342 1.1732
Separation Index 3.8623 2.0948
Number of Strata 5.4830 3.1264
Reliability 0.9372 0.8144
======
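The rows of this table are related to one another. The sketch below uses the standard Rasch definitions: adjusted variance = observed variance minus mean square error, separation = adjusted SD / root MSE, strata = (4 x separation + 1) / 3, and reliability = adjusted variance / observed variance. These reproduce the reported values to within rounding. That jMetrik uses exactly these formulas is inferred from the agreement with the numbers, not taken from its documentation.

```python
import math


def scale_quality(observed_variance, mean_square_error):
    """Derive separation, strata, and reliability from observed variance and MSE."""
    adjusted_variance = observed_variance - mean_square_error
    separation = math.sqrt(adjusted_variance) / math.sqrt(mean_square_error)
    return {
        "adjusted_variance": adjusted_variance,
        "separation": separation,
        "strata": (4 * separation + 1) / 3,
        "reliability": adjusted_variance / observed_variance,
    }


# Observed Variance and Mean Square Error copied from the table above.
items = scale_quality(0.9312, 0.0585)
persons = scale_quality(1.6900, 0.3136)
for label, stats in [("Items", items), ("Persons", persons)]:
    print(label, {name: round(value, 2) for name, value in stats.items()})
```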
Person Estimates
The person estimates are put into the data table along with the item responses and other person-specific information.
(In V 4.0.0, you have to click on “Refresh Data View” to see stuff that’s been added.)
Here are the first few rows of the person information generated by this run
Sum and vsum are identical here. If there were missing data, they might be different.
Theta is the Rasch person measure.
WMS is Infit.
UMS is Outfit.
Stdwms and stdums are “Z” transforms of the wms and ums.
Relationship of Person estimates from jMetrik analysis and those from B&F Steps