This article was inspired by a recent PETAL Speakers Series session, "Critical Perspectives on Student Course Evaluations," and aims to continue the conversation and research shared there on how we evaluate teaching.
Contact life@miami.edu if you have any questions related to the approaches discussed.
Student Evaluations of Teaching (SETs) are used by institutions to gather feedback from students about a course and its teaching effectiveness. SETs are typically administered as anonymous online surveys at the end of a semester. Students have a limited period of time to complete the survey, which can consist of closed- and open-ended questions rating the course experience and the instructor’s teaching. Faculty members can later access survey responses to review them and make changes based on the feedback. Many institutions factor in SETs when considering faculty members for hire or for teaching awards. Often, SETs are the only deciding factor in promotion and tenure decisions.
Using SETs as the primary measure of teaching effectiveness in faculty review processes can disadvantage faculty from marginalized communities (American Sociological Association, 2019). Non-white instructors receive lower ratings than their white colleagues and are evaluated less positively by white male students (McPherson & Jewell, 2007; Bavishi et al., 2010). In a study analyzing racial bias in SET results at a predominantly white institution, researchers found that Black faculty members received the lowest overall mean scores (Smith & Hawkins, 2011). Black faculty members also received the lowest mean scores on items that measure teaching effectiveness, which are often tied to personnel decisions such as tenure, promotion, and merit pay increases.
There is consistent bias against female faculty in SET results. Studies show that students evaluate male professors more favorably on teaching effectiveness measures, even when female faculty exhibit the same teaching behaviors (Boring, Ottoboni, & Stark, 2016). Students also typically use gendered language when evaluating faculty members and more often comment on a woman’s appearance and personality in their evaluations (Mitchell & Martin, 2018). Moreover, women of color face unique challenges in higher education, including issues of legitimacy, lack of institutional support, and tokenization, which are also reflected in SET scores (Ford, 2011).
Students with English as their first language often view accents as a communication barrier and give slightly lower ratings to non-native English-speaking instructors. Students can have a prejudiced view of non-native speakers of English and are more likely to interrupt them, complete their sentences, and restate their ideas (Rubin & Smith, 1990). Faculty who are non-native speakers of English are also often perceived by students as poor teachers regardless of their pedagogical practices (Miller & Chamberlin, 2000).
An array of other factors can affect SET scores, including class size, subject area, and, in some instances, food. Randomized studies show that instructors who offered chocolate bars or cookies to students were rated substantially higher than instructors who did not (Youmans & Lee, 2007; Hessler, 2018). SET scores in small classes are also more strongly influenced by outliers, and small classes often receive higher ratings than large-enrollment courses (American Sociological Association, 2019).
It is evident from the research that SETs alone are insufficient to evaluate teaching effectiveness. In this section, we provide recommendations, including actionable steps for faculty and case studies from other institutions on how to holistically collect data and feedback that can be used toward improving teaching.
The University of Southern California uses peer review of teaching (PRT) for faculty evaluation alongside a teaching statement. The university provides detailed guides and templates that schools and departments can adopt, modify, or rewrite as needed. During the peer review process, faculty are reviewed by two peer evaluators who observe at least two classes each. Its Center for Excellence in Teaching provides group training for faculty on using evaluative checklists for peer review and facilitates norming sessions.
Since 2017, the University of Oregon has been revising its teaching evaluation system in light of research indicating that student evaluations do not reflect teaching effectiveness and are often swayed by biases. It now uses a holistic approach centered on peer review, self-reflection, and student feedback. The self-reflective portion not only allows instructors to reflect on their own teaching but also gives them an opportunity to provide evaluators with context concerning student feedback.
The University of Michigan’s Center for Research on Learning and Teaching (CRLT) encourages faculty to take a more active role in their department or school’s development of teaching evaluation systems. Each school and department can implement a specific evaluative method based on faculty consensus and accommodate diverse teaching methods. The center recommends over 20 teaching evaluation methods that can be tailored to fit each department or college’s needs. Associate deans, chairs, or faculty committees can meet with the CRLT to discuss teaching evaluation methods. Additionally, the center recommends using student feedback in conjunction with other methods and not in isolation.
Vanderbilt University's Center for Teaching offers faculty the opportunity to receive mid-semester feedback through small-group analysis. This method involves a two-step process. A Center for Teaching consultant visits a faculty member’s class toward the end of a session and conducts anonymous assessments with students while the faculty member leaves the room. In small groups, students answer questions related to learning outcomes and teaching, and then the whole class discusses their responses. The consultant compiles all student feedback into a report that is shared with the faculty member in a consultation meeting. At that meeting, the consultant can support the faculty member in interpreting the data, finding areas for improvement, and planning a course of action.
Tufts University offers a non-evaluative Teaching Squares program through its Center for the Enhancement of Learning and Teaching. Throughout a semester, four faculty members from different disciplines agree to observe one another’s classes and discuss their observations. Faculty meet as a group once to coordinate and learn about the process, and then a second time after observations to go over their insights. Teaching Squares are meant to inspire peer evaluation and self-reflection as instructors talk about pedagogy and what they have learned from one another throughout the process.
American Sociological Association (ASA). (2019). Statement on Student Evaluations of Teaching. https://www.asanet.org/sites/default/files/asa_statement_on_student_evaluations_of_teaching_feb132020.pdf
Bavishi, A., Madera, J. M., & Hebl, M. R. (2010). The effect of professor ethnicity and gender on student evaluations: Judged before met. Journal of Diversity in Higher Education, 3(4), 245–256. https://doi.org/10.1037/a0020763
Boring, A., Ottoboni, K., & Stark, P. (2016). Student evaluations of teaching (mostly) do not measure teaching effectiveness. ScienceOpen Research, 1–11. https://www.scienceopen.com/hosted-document?doi=10.14293/S2199-1006.1.SOR-EDU.AETBZC.v1
Ford, K. A. (2011). Race, Gender, and Bodily (Mis)Recognitions: Women of Color Faculty Experiences with White Students in the College Classroom. The Journal of Higher Education, 82(4), 444–478. doi:10.1353/jhe.2011.0026
Hessler, M., Pöpping, D. M., Hollstein, H., Ohlenburg, H., Arnemann, P. H., Massoth, C., Seidel, L. M., Zarbock, A., & Wenk, M. (2018). Availability of cookies during an academic course session affects evaluation of teaching. Medical Education, 52(10), 1064–1072. https://doi.org/10.1111/medu.13627
Johnson-Bailey, J., & Lee, M. (2005). Women of color in the academy: Where’s our authority in the classroom? Feminist Teacher: A Journal of the Practices, Theories, and Scholarship of Feminist Teaching, 15, 111–122. https://gws.arizona.edu/sites/gws.arizona.edu/files/Women%20of%20Color%20in%20the%20Academy.pdf
Laube, H., Massoni, K., Sprague, J., & Ferber, A. (2007). The impact of gender on the evaluation of teaching: what we know and what we can do. NWSA Journal, 19(3), 87–104. https://www.jstor.org/stable/40071230?seq=1#metadata_info_tab_contents
Lawrence, J. (2018). Student evaluations of teaching are not valid. American Association of University Professors. https://www.aaup.org/article/student-evaluations-teaching-are-not-valid#.YjHmmy-B1q
McPherson, M. A., & Jewell, R. T. (2007). Leveling the playing field: should student evaluation scores be adjusted? Social Science Quarterly, 88, 868–881. https://www.jstor.org/stable/42956226
Miller, J., & Chamberlin, M. (2000). Women Are Teachers, Men Are Professors: A Study of Student Perceptions. Teaching Sociology, 28(4), 283–298. https://doi.org/10.2307/1318580
Rubin, D. L., & Smith, K. A. (1990). Effects of accent, ethnicity, and lecture topic on undergraduates' perceptions of nonnative English-speaking teaching assistants. International Journal of Intercultural Relations, 14(3), 337–353. https://doi.org/10.1016/0147-1767(90)90019-S
Smith, B. P., & Hawkins, B. (2011). Examining Student Evaluations of Black College Faculty: Does Race Matter? The Journal of Negro Education, 80(2), 149–162. http://www.jstor.org/stable/41341117
Youmans, R., & Lee, B. (2007). Fudging the numbers: distributing chocolate influences student evaluations of an undergraduate course. Teaching of Psychology, 34(4), 245–247. https://psychology.usu.edu/research/factotum/files/Fudging%20the%20Numbers_%20Distributing%20Chocolate%20Influences%20Student%20Evaluations.pdf