
在线考试离我们有多近 (How Close Are We to Online Exams?)

Updated: Jun 29, 2024

English version follows the Chinese version.




先说说机考,也就是在计算机上进行的考试,它可能也是某种形式的在线考试吧。


今年3月份的《IBDP通讯》报道了IB组织前期开展的针对IB学校数字化考试(digital examinations) 准备程度的调查结果。该报道显示,2509所IB学校参与了本次调查,占到所有DP和CP学校(3739)的67%。


参与调查的学校中有74%表示到2026年可以准备好迎接数字化考试,而到2027年准备就绪的学校的比例可达84%。虽然一些学校也表达了对网络可靠性、可用设备的数量、和教师的经验等方面的担忧,但是总体来说有接近八成的DP和CP学校表示期待数字化的IB全球考试。


其实早在2016年IB就推出了MYP的机考(eAssessment)。和传统的纸笔考试一样,学生还是要在一个教室里,同样有监考老师在场,唯一不同的是学生需要在屏幕上进行读题和答题。我估计DP和CP的数字化考试应该与MYP的机考差不多。


其它的一些考试局也在推进这样的机考模式,只是进程各有不同而已。


美国大学理事会(College Board)从2023年开始推出SAT和AP机考。2024年的八门AP机考还只是在美国境内实行,但是从2025年开始,总共有九门AP学科将在全球范围内实施机考。(apcentral.collegeboard.org/exam-administration-ordering-scores/digital-ap-exams)


英国的A-Level考试也在逐渐被搬到屏幕上。我查看了几大A-Level考试局的网站,培生(Pearson)从2022年开始试点机考,并且计划到2026年把14门学科的考试搬上屏幕。剑桥(Cambridge)计划到2030年将绝大部分考试转化成机考,而AQA也计划到2030年实现部分学科机考。


其实这种有老师现场监考的机考应该不是什么新事物了。我记得我本人早在2000年前后就参加过当时中国流行的计算机中级考试,也是机考,考生在有老师监考的计算机房里在电脑上回答题目,只不过我猜想当时的考试题目可能不是通过互联网实时获取和更新的。当时的互联网应该还没有像现在这么发达。有一种可能是,当时的考试题目是拷贝到了本地计算机上的。而上面所提到的几个国际课程的机考是考生通过登入互联网平台实时获得考题并答题,而考生所输入的答案也是实时上传到考试机构的主机或者云端的。


我现在工作的加拿大NOIC学院开设了IBDP和OSSD。IBDP的数字化考试还没有实施,不过IBDP考卷的数字化阅卷倒是实施了很长时间了。现在的做法是学生用纸笔答题,学校负责将考卷快递到由IB指定的扫描中心,然后扫描好的试卷会通过网络传输给阅卷的教师进行在线实时批改。我很早就成为了IBDP数学考官,因此经历从纸笔阅卷到在线阅卷的转变,现在早已习惯了在屏幕上批改复杂的数学题目了。当然将整个数学考试搬到屏幕上让学生通过电脑考试则完全是另外一个难度级别的工程了。而IB组织似乎正在进行这项工作的准备。


OSSD没有多少统一的考试,唯一一个需要学校组织的就是文化考试(Literacy Test)。这个OSSLT在疫情期间已经被搬到了网上,现在即使疫情结束了它也保持了机考的方式。我们学校每年要组织两个OSSLT机考,其实也就是把学生组织到一个教室,安排监考,指导学生在屏幕上按照提示和题目进行答题而已。


负责OSSLT的部门也允许学校为无法到学校参加考试的学生在异地组织考试。我本人就协调过这样的一场考试,成功地组织了几十个学生在中国境内参加了由我们加拿大校区牵头安排的机考。我们学校在国内有办公室和同事,他们负责了在现场的考务工作,而我则负责培训的考务人员,考试过程中的后台监控,以及和考试机构的沟通工作。如果没有时差的话,我觉得这样的考试组织与在学校内部组织的机考似乎没有太大的不同,只不过组织起来需要更多跨地区的工作人员在整个过程中保持随时的沟通和密切的协作。


真正的在线考试应该是没有老师在现场监考的机考。


我首先想到的是在新冠疫情期间火起来的多邻国(Duolingo)语言考试。虽然我本人没有参加过这个考试,但是我听到过当时不少学生会把它当成当时无法现场参加的雅思和托福考试的替代品。多邻国语言考试就是学生在家通过电脑和互联网可以完成整个考试,没有老师在学生身边监考。


根据多邻国自己的介绍,为了确保考试结果的可信度,整个测试和后期核实过程遵循严格的管理流程并使用了很多技术手段。每一个在线的考试都是全程录像,然后这个录像会被通过多个方式进行检查,其中包括第一轮的人工智能工具的检查,然后是两轮由真人进行的检查,总共要核查几十项指标,以确保考试的学术诚信。如果那两轮人工核查的结果有不一致之处,那么公司将会安排第三轮的人工复查。


多邻国的语言考试还是一种计算机自适应的考试,也就是说测试会根据学生的能力水平不断地调整题目。这样每一个测试既为学生量身定制,也可以变得不那么冗长。也就是说,其它非自适应的测试往往为了满足大部分考生的特点不得不保留了许多本来可以省略的考题。我的理解是,每个考生参加多邻国测试时所看到的题目基本上都是不一样的,这大大减少了考生之间作弊的可能性。多邻国还监测和公布考试题目在所有测试中出现的频率(item exposure rate)和整个考试卷之间的重复度(exam overlap rate),前者的均值是0.2%,后者为0.62%,据称这是所有标准化考试中最低的。(englishtest.duolingo.com)


多邻国的语言考试是完全没有老师在考试现场或者在线进行监考,它是通过考试的录像来进行学术诚信的鉴别。在教师实地监考的环境下学生也可以参加在线考试。实际上这种方式也远已不是想象场景了,我相信在三年新冠疫情期间有很多学校就组织过这样的考试。


我在上海华旭工作期间正逢疫情,有很长一段时间我就是负责学校的在线教学,而在线考试是其中的重要部分。我在2022年6月份的时候和当时的一位同事写过一篇博客文章《未雨绸缪,线上教学管理是现代校长必备能力》,其中就记录了我们当时的一些做法。


为了确保有老师监考的在线考试的可信度,我们当时制定了一系列的制度和流程,其中包括在线考试政策、在线监考教师工作规范,和在线监考流程检查清单。在线监考的老师需要根据那个清单在考试之前、中间和考试之后检查各项规范和步骤。比如说,老师需要在考试之前要求学生展示房间内部环境,以确保考生是在无他人协助之下进行考试。老师还需要检查学生的摄像头位置,以确保考生始终在教师的可监控范围之内。我们还规定了学生通过在线学习管理平台提交考卷的方法和时间,老师需要确保所有考生全部提交之后才能允许考生离开“考场”。我记得最初的时候我们使用的是ZOOM,后来它需要收费之后就转为腾讯会议了。我们还要求监考老师将考试的过程全程录像,以便在出现可疑学术诚信问题的时候核查。


在上海华旭工作期间我们高中部还使用过一段时间英国某个考试机构编制的入学考试,考试可以在线进行,学生可以到学校来在招生老师的现场监考下完成,或者在家考试,而招生老师则是在线监考。采用这个考试的一个很重要的原因也是因为疫情期间学生无法到上海或者到学校来参加入学考试。


显然,这种有人在线监考的在线考试组织起来比多邻国的无人监考的在线考试要麻烦很多,成本也高许多,就好像是把传统的考场中的老师和学生搬到了线上,他们必须在同一时间出现在同一空间,不同的是师生在线上空间而已。


在上海华旭组织的线上考试中的教师和学生互相都很熟悉,所以学生一般也不会投机取巧,即使有些作弊现象,也很容易被老师在现场或者之后的答卷上找到迹象。试想如果监考人员根本不认识被监考的学生,那么我觉得这个工作的难度可能会大大提高,比如在线身份认证就将是一个问题,还有监考老师需要平均地关注每一个学生,因为他不知道任何一个学生的背景信息,比如哪些学生成绩比较差因此容易想到作弊。


现在我服务的NOIC学院既开设实体学校的课程项目,也提供在线课程,主要是OSSD和IBDP。在线教学必然会涉及到学生评估和在线考试。我们基本上也是采用的有监考教师实时监考的在线考试模式,而考试使用的平台就是学生平时在线学习和实时上课的专业教学平台。同时,我们也在考试内容上下功夫,开发多套试题同时随机化考题大大减少了学生作弊的可能性。我们也制定了严格的在线考试规范、流程和政策,所有这个体系实施下来的效果我认为还是相当不错的。每个学期我们要组织数百场在线测试和考试,绝大部分的考试过程和结果都是符合管理和教师的预期的,出现问题的场次和学生极少。


总结一下,以上讨论了三种在线考试的形式。


第一种是有老师在考生的同一个房间里监考的考试;第二种是有老师在考生的同一个线上教室里监考的考试;第三种是根本没有真人监考的在线考试。


根据我所了解的信息,我感觉几大国际高中课程的考试局以后采用基本上都是第一种,也即组织学生到一个物理房间里,有老师在现场监考的形式。由于在线课程项目的需要,第二种形式可能出现在很多开设在线课程的学校,尤其是高中和大学。当学生无法在指定的时间到学校参加考试的时候,无疑通过老师在线监考的方式对这些学校来说是一种最优化的选择。这种方式除了组织起来比较麻烦,并且需要投入很多管理人力之外,对前期技术开发的要求并不高。

而第三种形式,也就是多邻国语言测试所采用的形式就需要对后期检查的技术和流程有非常严密和严格的设计,而这种形式对考试题目的可选数量和提供方式的要求也是最高的。因此我不认为大部分考试局和学校会考虑这种无人监考的在线考试,除非以后人工智能的监考手段已经发达到了可以杜绝作弊现象的发生,并且这种技术手段的成本也可以降低到大面积使用的范围之内。


线上学习已经成为现代教育中不可缺少的部分,而且它的比例可能会随着技术的发展、学生技能的提高,以及在线内容的愈加丰富变得越来越普遍。评估是学习的一个不可缺少的组成部分,因此在线考试是学生、教师、学校和考试组织必须面对和解决的问题。我认为现在已经不是一味地讨论各种在线考试的困难和缺点的时候了,所有人的注意力应该切换到选择哪一种适合的在线考试形式和如何解决在线考试过程的各种可能出现的问题了。


有各种迹象表明,我们很可能正在进入一个人工智能的时代。就在线考试而言,人类对这个过程的管理和鉴别的能力已经很明显地到达了一个边界了,那么人工智能在解决在线考试,尤其是无真人监考的在线考试学术诚信问题这个课题中是否可以到达人类无法企及的程度,这可能是所有在线考试的利益相关者应该密切关注的趋势。



First, let's talk about computer-based tests, that is, tests taken on a computer, which can arguably be counted as one form of online testing.


The March issue of this year's IBDP Newsletter reported the results of an IB survey on how ready IB schools are for digital examinations. According to the report, 2,509 IB schools took part, or 67% of all 3,739 DP and CP schools. Of the participating schools, 74% said they could be ready for digital exams by 2026, and 84% by 2027. Some schools raised concerns about network reliability, the number of available devices, and teachers' experience, but overall nearly 80% of DP and CP schools said they were looking forward to digital IB global exams.


In fact, as early as 2016, the IB launched computer-based exams (eAssessment) for its Middle Years Programme (MYP). As with traditional paper-based exams, students still sit the exam in a classroom with a proctor present. The only difference is that students read and answer the questions on a screen. I assume that digital exams for the DP and CP will be similar to the MYP computer-based exams.


Other examination boards are also promoting this model of computer-based testing, albeit at different paces.


The College Board in the United States began rolling out computer-based SAT and AP exams in 2023. In 2024, the eight digital AP exams are offered only within the United States, but starting in 2025 a total of nine AP subjects will be administered as computer-based exams worldwide. (apcentral.collegeboard.org/exam-administration-ordering-scores/digital-ap-exams)


The UK's A-Level exams are also gradually moving onto the screen. I checked the websites of several major A-Level examination boards: Pearson began piloting computer-based exams in 2022 and plans to move exams in 14 subjects on screen by 2026, Cambridge plans to convert most of its exams to computer-based exams by 2030, and AQA also aims to offer computer-based exams in some subjects by 2030.


Actually, computer-based testing with in-person proctoring is not a new concept. I remember taking an intermediate computer-skills exam that was popular in China around 2000, where candidates answered questions on computers in a proctored computer lab. However, I suspect the exam questions back then were not fetched and updated in real time over the internet, as the internet was not as developed as it is now. It's possible the questions were simply copied onto the local computers. In contrast, in the computer-based exams of the international high school programs mentioned above, students log in to an online platform to receive and answer questions in real time, and their answers are uploaded in real time to the exam organization's servers or cloud.


At my current workplace, we offer the IBDP and the OSSD. Although digital exams for the IBDP have not yet been implemented, digital marking of IBDP exam papers has been in place for a long time. Currently, students write their answers with pen and paper, schools courier the exam papers to IB-designated scanning centers, and the scanned papers are then sent over the network to examiners, who mark them online. I became an IBDP math examiner early on and experienced the transition from paper-based to online marking, so I have long been used to marking complex math problems on a screen. However, moving the entire math exam onto a screen for students to take on a computer is a project of a completely different level of difficulty. It seems the IB organization is preparing for this.


OSSD has few unified exams, with the only required school-organized exam being the Literacy Test (OSSLT). During the pandemic, the OSSLT was moved online, and even after the pandemic, it has remained computer-based. Our school organizes two OSSLT computer-based exams each year, essentially gathering students in a classroom, proctoring them, and guiding them to answer questions on the screen.


The department responsible for the OSSLT also allows schools to organize the test off site for students who cannot come to school. I coordinated one such exam, successfully arranging for dozens of students in China to take a computer-based exam led by our Canadian campus. Our school has offices and colleagues in China who handled the onsite exam arrangements, while I was responsible for training the exam personnel, backend monitoring during the exam, and communication with the exam organization. Apart from the time difference, I feel that organizing such an exam is not very different from organizing one inside the school, except that it requires staff in different regions to stay in constant communication and cooperate closely throughout the process.


A truly online exam should be a computer-based exam without onsite proctoring by teachers.


The first example that comes to mind is the Duolingo language test that became popular during the COVID-19 pandemic. Although I haven't taken this test myself, I heard many of my students used it as a substitute for the IELTS and TOEFL tests, which they couldn't take in person at that time. The Duolingo language test allows students to complete the entire exam at home through a computer and the internet, without a teacher proctoring in person.

According to Duolingo, to ensure the credibility of exam results, the entire test and post-test verification process follows strict management procedures and uses many technical safeguards. Every online exam is recorded in full, and the recording is checked in several ways: first by an AI tool, then in two rounds of human review, covering dozens of indicators in total to ensure academic integrity. If the two human reviews disagree, a third round of human review is arranged.
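
The review flow described above can be sketched in a few lines of Python. This is only an illustration of the logic as I understand it, not Duolingo's actual system: the function name `review_session` is hypothetical, and treating the third reviewer as a tie-breaker is my own assumption about how a disagreement is resolved.

```python
from typing import Callable, Tuple


def review_session(first_ok: bool,
                   second_ok: bool,
                   third_review: Callable[[], bool]) -> Tuple[bool, int]:
    """Return (session_accepted, human_rounds_used).

    first_ok / second_ok are the verdicts of the two human reviewers.
    third_review is only called when they disagree (assumed tie-breaker).
    """
    if first_ok == second_ok:
        # Both reviewers agree, so no third round is needed.
        return first_ok, 2
    # The reviewers disagree, so a third human review decides.
    return third_review(), 3


# Two reviewers agree: the session is accepted after two rounds.
agreed = review_session(True, True, lambda: False)
# Two reviewers disagree: the third reviewer's verdict is used.
disputed = review_session(True, False, lambda: False)
```

The third review is passed in as a function so that it only runs when it is actually needed, which mirrors the idea that the extra round is arranged only in case of a discrepancy.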


The Duolingo language test is also a computer-adaptive test, meaning the questions are continually adjusted to the student's ability level, which makes each test both tailored to the student and less lengthy. Non-adaptive tests, by contrast, often have to retain many questions that could otherwise be omitted in order to cover the full range of test-takers. My understanding is that the questions each test-taker sees are largely different, which greatly reduces the possibility of cheating between test-takers. Duolingo also monitors and publishes the item exposure rate (how often each question appears across all tests) and the exam overlap rate (how much different test sessions have in common), with average values of 0.2% and 0.62%, respectively, which it claims are the lowest among all standardized tests. (englishtest.duolingo.com)
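
To make these two measures more concrete, here is a small Python sketch that computes them for a toy set of three tests. It assumes the common textbook definitions (exposure rate as the share of tests in which an item appears, overlap rate as the average share of items two tests have in common); Duolingo's exact calculation may differ, and the function names and sample data are made up for illustration.

```python
from collections import Counter
from itertools import combinations


def item_exposure_rates(tests):
    """Share of all tests in which each item appears."""
    counts = Counter(item for test in tests for item in set(test))
    return {item: n / len(tests) for item, n in counts.items()}


def mean_overlap_rate(tests):
    """Average share of items that two tests have in common, over all pairs.

    The shared count is divided by the length of the shorter test
    (an assumption made here to handle tests of different lengths).
    """
    shares = [len(set(a) & set(b)) / min(len(a), len(b))
              for a, b in combinations(tests, 2)]
    return sum(shares) / len(shares)


# Three hypothetical test sessions, each with four questions.
tests = [
    ["q1", "q2", "q3", "q4"],
    ["q1", "q5", "q6", "q7"],
    ["q8", "q9", "q10", "q11"],
]

rates = item_exposure_rates(tests)
overlap = mean_overlap_rate(tests)
print(f"exposure of q1: {rates['q1']:.1%}")
print(f"mean overlap:   {overlap:.1%}")
```

In this toy example only `q1` is reused, so it has a higher exposure rate than every other question, and only one of the three pairs of tests shares any question at all. A very large question bank pushes both numbers toward zero, which is the point of the figures quoted above.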


The Duolingo language test has no teacher proctoring either in person or online; instead, academic integrity is verified by reviewing the test recordings. Students can also take online exams with teachers proctoring onsite. In fact, this is no longer a hypothetical scenario: I believe many schools organized such exams during the three years of the COVID-19 pandemic.

My time at Shanghai Arete Bilingual School coincided with the pandemic, and for a long stretch I was in charge of the school's online teaching, of which online exams were an important part. In June 2022, a colleague and I wrote a blog post titled "Preparing for the Future: Online Teaching Management is a Modern Principal's Essential Skill," which recorded some of our practices at the time.


To ensure the credibility of online exams proctored by teachers, we developed a set of policies and procedures, including an online exam policy, guidelines for online proctors, and an online proctoring checklist. Proctors had to work through the checklist before, during, and after the exam to verify each rule and step. For example, teachers had to ask students to show their room before the exam to make sure they were taking the exam without anyone's assistance. Teachers also had to check the position of each student's camera to make sure the student stayed within the teacher's view at all times. We also specified how and when students were to submit their exam papers through the online learning management platform, and teachers had to confirm that every student had submitted before allowing anyone to leave the "exam room." Initially we used Zoom, but we switched to Tencent Meeting after Zoom began charging for the service. We also required proctors to record the entire exam so that it could be reviewed later if any academic integrity concerns arose.
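
A checklist like the one described above is easy to represent in code. The following Python sketch is only an illustration: the class and function names are hypothetical, the items are limited to the checks mentioned in this post, and the phase I have assigned to each item is my own assumption rather than a copy of the real document.

```python
from dataclasses import dataclass
from enum import Enum


class Phase(Enum):
    BEFORE = "before"
    DURING = "during"
    AFTER = "after"


@dataclass
class ChecklistItem:
    phase: Phase
    description: str
    done: bool = False


def build_checklist():
    """Checks taken from the practices described in this post."""
    return [
        ChecklistItem(Phase.BEFORE, "Student has shown the room; no one else is present"),
        ChecklistItem(Phase.BEFORE, "Camera is positioned so the student stays in view"),
        ChecklistItem(Phase.BEFORE, "Recording of the session has started"),
        ChecklistItem(Phase.DURING, "Student remains visible on camera"),
        ChecklistItem(Phase.AFTER, "All papers submitted through the learning platform"),
    ]


def outstanding(checklist, phase):
    """Descriptions of the checks in a phase that are not yet ticked off."""
    return [item.description for item in checklist
            if item.phase is phase and not item.done]


def may_dismiss_students(checklist):
    """Students may leave the 'exam room' only when every check is done."""
    return all(item.done for item in checklist)


checklist = build_checklist()
print(outstanding(checklist, Phase.BEFORE))
```

Keeping the phase on each item lets a proctor see at a glance what is still open before, during, or after the exam, and the final function captures the rule that nobody leaves until everything, including submission, has been confirmed.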


During my time at Shanghai Arete Bilingual School, our high school department also used an online entrance exam from a UK exam organization, which students could take either at school with an onsite proctor or at home with an online proctor. One important reason for using this exam was that students couldn't come to Shanghai or the school for the entrance exam due to the pandemic.


Clearly, organizing an online exam with a teacher proctoring remotely is much more cumbersome and costly than an unproctored online exam like Duolingo's. It is like moving the teachers and students of a traditional exam room online: they still have to be in the same space at the same time, only now that space is virtual.


In the online exams organized at the Shanghai school, teachers and students knew each other well, so students generally did not try to cheat. Even when some cheating did occur, teachers could easily spot signs of it during the exam or later in the answer sheets. If the proctor did not know any of the students being monitored, I think the work would become much harder. For instance, online identity verification would be a problem, and the proctor would have to pay equal attention to every student, having no background information such as which students are weaker and therefore more likely to be tempted to cheat.


At my current school, we offer both in-person and online programs, mainly the OSSD and the IBDP. Online teaching inevitably involves student assessment and online exams. We mostly use online exams with real-time proctoring, run on the same professional teaching platform that students use for their everyday online learning and live classes. We also work on the exam content itself, developing multiple sets of questions and randomizing them, which greatly reduces the opportunity for cheating. We have established strict online exam regulations, procedures, and policies, and I think the system as a whole has worked quite well. Each semester we organize hundreds of online tests and exams; the vast majority of them run as management and teachers expect, and very few sessions or students present problems.
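
As a rough illustration of what "multiple sets plus randomization" can look like, here is a short Python sketch. It is not a description of our actual platform: the function `build_paper`, the set names, and the idea of seeding the random generator with the exam and student IDs are all assumptions made for this example.

```python
import random


def build_paper(question_sets, student_id, exam_id):
    """Assign one question set to a student and shuffle its question order.

    The random generator is seeded with the exam and student IDs, so the
    same student always gets the same paper for the same exam, which makes
    the assignment reproducible if it has to be checked later.
    """
    rng = random.Random(f"{exam_id}:{student_id}")
    chosen = rng.choice(sorted(question_sets))  # name of the assigned set
    questions = list(question_sets[chosen])     # copy, so the bank is untouched
    rng.shuffle(questions)
    return chosen, questions


# Hypothetical question bank with two parallel sets.
question_sets = {
    "A": ["A1", "A2", "A3", "A4"],
    "B": ["B1", "B2", "B3", "B4"],
}

set_name, paper = build_paper(question_sets, student_id="s001", exam_id="math-unit-3")
print(set_name, paper)
```

Because neighbouring students are likely to receive a different set or at least a different question order, sharing answers by question number becomes much less useful.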


To summarize, three forms of online exams were discussed above:

  1. Exams with teachers proctoring in the same room as the students.

  2. Exams with teachers proctoring in the same online classroom as the students.

  3. Exams without any real-time proctoring.


Based on the information I have, I believe the examination boards of the major international high school programs will mostly adopt the first form, bringing students together in a physical room with onsite proctoring. Because of the needs of online programs, the second form is likely to appear in many schools that offer online courses, especially high schools and universities. When students cannot come to school for an exam at the designated time, online proctoring by a teacher is undoubtedly the best option for these schools. Although this method is cumbersome to organize and requires considerable administrative staffing, it does not demand much upfront technical development.

The third form, used by the Duolingo language test, requires very rigorous technical and procedural design for post-exam verification, and it also places the highest demands on the size of the question bank and the way questions are delivered. Therefore, I do not believe most exam boards and schools will consider unproctored online exams unless AI-based proctoring becomes advanced enough to eliminate cheating and its cost falls to a level that allows widespread use.


Online learning has become an indispensable part of modern education, and its share is likely to grow as technology develops, students' skills improve, and online content becomes richer. Assessment is an integral part of learning, so online testing is an issue that students, teachers, schools, and exam organizations must face and resolve. I think the time for dwelling on the difficulties and drawbacks of online exams has passed; everyone's attention should now turn to choosing a suitable form of online exam and to solving the problems that may arise during online exams.


There are various signs that we may be entering an era of artificial intelligence, although it is still at an early stage. As far as online exams are concerned, human capacity to manage and verify the process has clearly reached a limit. Whether AI can address the academic integrity issues of online exams, especially unproctored ones, at a level beyond human reach is a question that all stakeholders in online exams should watch closely.

©2020 by Xuefeng Huang. Proudly created with Wix.com
