Frequently Asked Questions About MIT’s Subject Evaluation Pilot
Why are we piloting new subject evaluation questions?
Continuously improving the quality of subject design and delivery over time is essential to MIT's educational mission. This can be accomplished by evaluating our teaching, reflecting on the results of that evaluation, and planning improvements for the next iteration of our teaching (see figure).
Although there are many ways to evaluate one’s own teaching, student evaluations are the most commonly used source of that data. However, student feedback is only as useful as the tools we use to measure it (d’Apollonia & Abrami, 1997; Medina et al., 2019). Student evaluations of a subject can be valid, reliable, and helpful for faculty/instructors “provided the instrument asks the right questions” (Benton & Ryalls, 2016, p. 7). Thus, the first part of the subject evaluation pilot is the trial adoption of a revised set of questions. The second part of the pilot involves providing time for students to complete the evaluations in class. Doing so will help improve response rates and welcome more student voices into the feedback process (Benton & Ryalls, 2016; Medina et al., 2019). In the last part of the proposed process, a teaching development specialist (TLL staff) will be available to help participating instructors and departments process and use student feedback (Marsh & Roche, 1997).
What are the limitations of the current subject evaluation questions (SEs)?
In the scholarly literature, a wealth of evidence shows that student ratings gathered with questions like those currently on the standard MIT subject evaluations can be problematic. These issues include:
Students are not well-equipped to answer some of the questions posed, such as whether the instructor has sufficient disciplinary knowledge/expertise.
Student ratings from standard SE questions are not a valid indicator of teaching effectiveness or student learning. Research has shown no correlation between student ratings based on traditional questions and student learning outcomes or teaching effectiveness (AAUP, 2016; Braga et al., 2011; Carrell & West, 2010; Deslauriers et al., 2019; Hornstein, 2017; Johnson, 2003; Uttl et al., 2017).
Students’ ratings of a subject (based on standard SE questions) are predicted better by how satisfied students are with their grades than by other factors (e.g., instructional quality; Kogan et al., 2022).
An instructor’s ethnicity, race, physical attractiveness, and age influence responses to traditional SE questions (Ambady & Rosenthal, 1993; Andersen & Miller, 1997; Arbuckle & Williams, 2003; Basow, 1995; Cramer & Alexitch, 2000; Linse, 2017; Reid, 2010; Wachtel, 1998; Worthington, 2002).
Overall, women are systematically rated lower than men on traditional SE questions because of their gender (Kreitzer & Sweet-Cushman, 2021).
For an excellent critique of standard, end-of-semester teaching evaluations, see Stark, P. B., & Freishtat, R. (2014), An Evaluation of Course Evaluations, ScienceOpen. Philip Stark is a faculty member in the Department of Statistics at UC Berkeley. See also this summary of Evaluating Student Evaluations of Teaching (Kreitzer & Sweet-Cushman, 2021) and this collection of articles from R. J. Kreitzer, UNC-Chapel Hill.
What kind of information will be collected?
The pilot questions ask students to consider their experiences in the subject, such as:
The various aspects of teaching and learning within the learning environment (classroom, lab, design or production space, etc.),
Their own engagement with the subject within and outside of the designated learning space, and
The impacts of specific instructor actions and approaches on their learning.
What are the questions on the pilot SE?
The questions for the pilot SE form were designed based on research on the design of course evaluations (e.g., Medina et al., 2019) and on work from STEM Demonstration Projects sponsored by the Association of American Universities (AAU). The pilot questions are available here.
Can a department still add its own department-specific questions to the pilot set?
Yes, we acknowledge and respect the need to gather department-specific information from subject evaluations. Therefore, the process for adding department-specific questions will remain unchanged. Please keep in mind that the length of the evaluation may impact student response rates.
Can individual instructors have their own subject-specific questions added to the SE?
Yes! The process for adding subject-specific questions will remain unchanged.
What can faculty and instructors learn about their teaching from the pilot evaluations?
There are many potential benefits for instructors participating in the pilot SE process. The pilot SE questions are focused on scholarly teaching criteria and, as such, are hypothesized to provide more detailed and informative student ratings than the traditional SEs (McCarthy, 2012). The pilot SE form also has a more comprehensive set of questions than the traditional SE form:
Some questions focus on course design, which is at least as important as the instructor’s performance for student perceptions of quality and engagement (Levinsson et al., 2024)
The pilot SE questions provide multiple pathways for faculty and instructors to demonstrate effective teaching
The pilot SE questions may provide richer insight into the student experience
The variety of pilot SE questions allows departments, faculty, and instructors to emphasize one or several items that are most important to focus on (McCarthy, 2012)
The pilot SE questions can provide benchmarks for monitoring changes over time (Medina et al., 2019)
Faculty and instructors can also use the feedback to:
Set teaching goals
Reflect on their teaching philosophy
Document their teaching in annual reviews and/or promotion and tenure cases
Focus on self-improvement rather than comparison to others
Providing time in class for students to complete the SE is essential to the success of the pilot. Setting aside class time increases inclusivity so that feedback reflects the perceptions of most students rather than only a subset. Additionally, it gives instructors the opportunity to discuss the importance of SEs and how they are used, which is associated with more constructive student feedback.
A consultation with teaching development specialists in TLL can help instructors interpret their SE results, for example to:
Avoid oversimplification of data about a complex process
Examine distributions of students’ responses, not just averages (see the brief illustration after this list)
Detect important trends within and across courses/semesters
Support teaching improvement plans that interpret the feedback within the course-specific context in collaboration with pedagogy experts
Consider the role of SEs in a more holistic approach to evaluating teaching effectiveness
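As a brief, hypothetical illustration of why distributions matter (the ratings below are invented for illustration, not MIT data), two subjects can have identical average ratings while telling very different stories about the student experience:

```python
# Hypothetical illustration: two subjects with the same average rating
# but very different response distributions (invented numbers, not MIT data).
from collections import Counter
from statistics import mean

subject_a = [4, 4, 4, 4, 4, 4]   # consistent, moderate ratings
subject_b = [1, 1, 2, 6, 7, 7]   # polarized ratings

for name, ratings in [("Subject A", subject_a), ("Subject B", subject_b)]:
    print(name,
          "mean:", round(mean(ratings), 2),
          "distribution:", dict(sorted(Counter(ratings).items())))

# Both means are 4.0, but Subject B's split distribution suggests the subject
# worked well for some students and poorly for others -- a pattern the
# average alone would hide.
```

Looking at the full distribution (and at the open-ended comments) in this way is one of the interpretation practices a TLL consultation can support.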
Is there someone who can help individual instructors and/or departments make sense of the responses?
Yes! Staff from the Teaching + Learning Lab (TLL) will be available to collaborate with individual faculty/instructors and/or department heads to interpret the student feedback and plan actionable changes to instruction based on that feedback.
How might participation in the pilot impact tenure and/or promotion?
We encourage department heads to consider the most appropriate use of student feedback for faculty and instructors in their departments. Because the pilot SEs emphasize the student learning experience, their results will not be directly comparable to those from the traditional SEs. That said, faculty and instructors will receive richer, more detailed, and more actionable feedback that will better support the self-reflection involved in the promotion and tenure process.
Must every faculty member/instructor in a participating department use the pilot subject evaluations?
Although we encourage widespread faculty/instructor participation, individual faculty and instructors within a participating department can opt out of the pilot. Students in subjects whose instructors have opted out will receive the standard/traditional version of the SEs. We will work with department administrators to ensure that the relevant SE questions are provided to students.
In subjects with multiple instructors, will all teaching staff need to opt in?
Yes. All instructors must agree to participate in the pilot for the subject to be included in it.
What should students be told about the pilot?
We encourage instructors and faculty to stress to students the importance of participating in the pilot and engaging thoughtfully with the questions. We will provide text/language you can use when speaking with your students about the pilot.
Which departments are participating?
We will update this information as departments sign on to participate in the pilot.
What are the next steps for departments that would like to learn more?
Department heads and individual faculty/instructors should reach out to sepilot@mit.edu to learn more.
References
Ambady, N., & Rosenthal, R. (1993). Half a minute: Predicting teacher evaluations from thin slices of nonverbal behavior and physical attractiveness. Journal of Personality and Social Psychology, 64(3), 431–441. https://doi.org/10.1037/0022-3514.64.3.431
Andersen, K., & Miller, E. D. (1997). Gender and student evaluations of teaching. PS: Political Science & Politics, 30(2), 216–219. https://doi.org/10.2307/420499
Arbuckle, J., & Williams, B. D. (2003). Students’ perceptions of expressiveness: Age and gender effects on teacher evaluations. Sex Roles, 49(9), 507–516. https://doi.org/10.1023/A:1025832707002
Basow, S. A. (1995). Student evaluations of college professors: When gender matters. Journal of Educational Psychology, 87(4), 656–665. https://doi.org/10.1037/0022-0663.87.4.656
Braga, M., Paccagnella, M., & Pellizzari, M. (2011). Evaluating students’ evaluations of professors (SSRN Scholarly Paper 2004361). https://doi.org/10.2139/ssrn.2004361
Bray, J. H., & Howard, G. S. (1980). Interaction of teacher and student sex and sex role orientations and student evaluations of college instruction. Contemporary Educational Psychology, 5(3), 241–248. https://doi.org/10.1016/0361-476X(80)90047-8
Carrell, S. E., & West, J. E. (2010). Does professor quality matter? Evidence from random assignment of students to professors. Journal of Political Economy, 118(3), 409–432. https://doi.org/10.1086/653808
Cramer, K. M., & Alexitch, L. R. (2000). Student evaluations of college professors: Identifying sources of bias. Canadian Journal of Higher Education, 30(2), Article 2. https://doi.org/10.47678/cjhe.v30i2.183360
Deslauriers, L., McCarty, L. S., Miller, K., Callaghan, K., & Kestin, G. (2019). Measuring actual learning versus feeling of learning in response to being actively engaged in the classroom. Proceedings of the National Academy of Sciences, 116(39), 19251–19257. https://doi.org/10.1073/pnas.1821936116
Fandt, P. M., & Stevens, G. E. (1991). Evaluation bias in the business classroom: Evidence relating to the effects of previous experiences. The Journal of Psychology, 125(4), 469–477. https://doi.org/10.1080/00223980.1991.10543309
Hornstein, H. A. (2017). Student evaluations of teaching are an inadequate assessment tool for evaluating faculty performance. Cogent Education, 4(1), 1304016. https://doi.org/10.1080/2331186X.2017.1304016
Johnson, V. E. (2003). Grade inflation: A crisis in college education. Springer Verlag.
Kogan, L. R., Schoenfeld-Tacher, R., & Hellyer, P. W. (2010). Student evaluations of teaching: Perceptions of faculty based on gender, position, and rank. Teaching in Higher Education, 15(6), 623–636. https://doi.org/10.1080/13562517.2010.491911
Kreitzer, R. J., & Sweet-Cushman, J. (2021). Evaluating student evaluations of teaching: A review of measurement and equity bias in SETs and recommendations for ethical reform. Journal of Academic Ethics, 1–12.
Levinsson, H., Nilsson, A., Mårtensson, K., & Persson, S. D. (2024). Course design as a stronger predictor of student evaluation of quality and student engagement than teacher ratings. Higher Education. https://doi.org/10.1007/s10734-024-01197-y
Linse, A. R. (2017). Interpreting and using student ratings data: Guidance for faculty serving as administrators and on evaluation committees. Studies in Educational Evaluation, 54, 94–106. https://doi.org/10.1016/j.stueduc.2016.12.004
MacNell, L., Driscoll, A., & Hunt, A. N. (2015). What’s in a name: Exposing gender bias in student ratings of teaching. Innovative Higher Education, 40(4), 291–303. https://doi.org/10.1007/s10755-014-9313-4
Marsh, H. W., & Roche, L. A. (1997). Making students’ evaluations of teaching effectiveness effective: The critical issues of validity, bias, and utility. American Psychologist, 52(11), 1187–1197. https://doi.org/10.1037/0003-066X.52.11.1187
McCarthy, M. (2012). Using student feedback as one measure of faculty teaching effectiveness. In M. E. Kite (Ed.), Effective evaluation of teaching: A guide for faculty and administrators. Retrieved from the Society for the Teaching of Psychology website: http://teachpsych.org/ebooks/evals2012/index.php
Medina, M. S., Smith, W. T., Kolluru, S., Sheaffer, E. A., & DiVall, M. (2019). A review of strategies for designing, administering, and using student ratings of instruction. American Journal of Pharmaceutical Education, 83(5), 7177. https://doi.org/10.5688/ajpe7177
Miller, J., & Chamberlin, M. (2000). Women are teachers, men are professors: A study of student perceptions. Teaching Sociology, 28(4), 283. https://doi.org/10.2307/1318580
Mitchell, K. M. W., & Martin, J. (2018). Gender bias in student evaluations. PS: Political Science & Politics, 51(3), 648–652. https://doi.org/10.1017/S104909651800001X
Reid, L. D. (2010). The role of perceived race and gender in the evaluation of college teaching on RateMyProfessors.Com. Journal of Diversity in Higher Education, 3(3), 137–152. https://doi.org/10.1037/a0019865
Rosen, A. S. (2018). Correlations, trends and potential biases among publicly accessible web-based student evaluations of teaching: A large-scale study of RateMyProfessors.com data. Assessment & Evaluation in Higher Education, 43(1), 31–44. https://doi.org/10.1080/02602938.2016.1276155
Stark, P. B., & Freishtat, R. (2014). An evaluation of course evaluations. ScienceOpen. https://doi.org/10.14293/S2199-1006.1.SOR-EDU.AOFRQA.v1
Uttl, B., White, C. A., & Gonzalez, D. W. (2017). Meta-analysis of faculty’s teaching effectiveness: Student evaluation of teaching ratings and student learning are not related. Studies in Educational Evaluation, 54, 22–42. https://doi.org/10.1016/j.stueduc.2016.08.007
Wachtel, H. K. (1998). Student evaluation of college teaching effectiveness: A brief review. Assessment & Evaluation in Higher Education, 23(2), 191–212. https://doi.org/10.1080/0260293980230207
Wagner, N., Rieger, M., & Voorvelt, K. (2016). Gender, ethnicity and teaching evaluations: Evidence from mixed teaching teams. Economics of Education Review, 54, 79–94. https://doi.org/10.1016/j.econedurev.2016.06.004
Worthington, A. C. (2002). The impact of student perceptions and characteristics on teaching evaluations: A case study in finance education. Assessment & Evaluation in Higher Education, 27(1), 49–64. https://doi.org/10.1080/02602930120105054