- How Good are Modern LLMs in Generating Relevant and High-Quality Questions at Different Bloom's Skill Levels for Indian High School Social Science Curriculum? Nicy Scaria, Suma Dharani Chenna, Deepak N. Subramani. pp. 1-10.
- Synthetic Data Generation for Low-resource Grammatical Error Correction with Tagged Corruption Models. Felix Stahlberg, Shankar Kumar. pp. 11-16.
- Pillars of Grammatical Error Correction: Comprehensive Inspection Of Contemporary Approaches In The Era of Large Language Models. Kostiantyn Omelianchuk, Andrii Liubonko, Oleksandr Skurzhanskyi, Artem N. Chernodub, Oleksandr Korniienko, Igor Samokhin. pp. 17-33.
- Using Adaptive Empathetic Responses for Teaching English. Li Siyan, Teresa Shao, Julia Hirschberg, Zhou Yu. pp. 34-53.
- Beyond Flesch-Kincaid: Prompt-based Metrics Improve Difficulty Classification of Educational Texts. Donya Rooein, Paul Röttger, Anastassia Shaitarova, Dirk Hovy. pp. 54-67.
- Large Language Models Are State-of-the-Art Evaluator for Grammatical Error Correction. Masamune Kobayashi, Masato Mita, Mamoru Komachi. pp. 68-77.
- Can Language Models Guess Your Identity? Analyzing Demographic Biases in AI Essay Scoring. Alexander Kwako, Christopher Michael Ormerod. pp. 78-86.
- Automated Scoring of Clinical Patient Notes: Findings From the Kaggle Competition and Their Translation into Practice. Victoria Yaneva, King Yiu Suen, Le An Ha, Janet Mee, Milton Quranda, Polina Harik. pp. 87-98.
- A World CLASSE Student Summary Corpus. Scott A. Crossley, Perpetual Baffour, Mihai Dascalu, Stefan Ruseti. pp. 99-107.
- Improving Socratic Question Generation using Data Augmentation and Preference Optimization. Nischal Ashok Kumar, Andrew S. Lan. pp. 108-118.
- Scoring with Confidence? - Exploring High-confidence Scoring for Saving Manual Grading Effort. Marie Bexte, Andrea Horbach, Lena Schützler, Oliver Christ, Torsten Zesch. pp. 119-124.
- Predicting Initial Essay Quality Scores to Increase the Efficiency of Comparative Judgment Assessments. Michiel De Vrindt, Anaïs Tack, Renske Bouwer, Wim van den Noortgate, Marije Lesterhuis. pp. 125-136.
- Improving Transfer Learning for Early Forecasting of Academic Performance by Contextualizing Language Models. Ahatsham Hayat, Bilal Khan, Mohammad Hasan. pp. 137-148.
- Can GPT-4 do L2 analytic assessment? Stefano Bannò, Hari Krishna Vydana, Kate M. Knill, Mark J. F. Gales. pp. 149-164.
- Using Program Repair as a Proxy for Language Models' Feedback Ability in Programming Education. Charles Koutcheme, Nicola Dainese, Arto Hellas. pp. 165-181.
- Automated Evaluation of Teacher Encouragement of Student-to-Student Interactions in a Simulated Classroom Discussion. Michael Ilagan, Beata Beigman Klebanov, Jamie N. Mikeska. pp. 182-198.
- Explainable AI in Language Learning: Linking Empirical Evidence and Theoretical Concepts in Proficiency and Readability Modeling of Portuguese. Luisa Ribeiro-Flucht, Xiaobin Chen, Detmar Meurers. pp. 199-209.
- Fairness in Automated Essay Scoring: A Comparative Analysis of Algorithms on German Learner Essays from Secondary Education. Nils-Jonathan Schaller, Yuning Ding, Andrea Horbach, Jennifer Meyer, Thorben Jansen. pp. 210-221.
- Improving Automated Distractor Generation for Math Multiple-choice Questions with Overgenerate-and-rank. Alexander Scarlatos, Wanyong Feng, Andrew S. Lan, Simon Woodhead, Digory Smith. pp. 222-231.
- Identifying Fairness Issues in Automatically Generated Testing Content. Kevin Stowe, Benny Longwill, Alyssa Francis, Tatsuya Aoyama, Debanjan Ghosh, Swapna Somasundaran. pp. 232-250.
- Towards Automated Document Revision: Grammatical Error Correction, Fluency Edits, and Beyond. Masato Mita, Keisuke Sakaguchi, Masato Hagiwara, Tomoya Mizumoto, Jun Suzuki, Kentaro Inui. pp. 251-265.
- Evaluating Vocabulary Usage in LLMs. Matthew Durward, Christopher Thomson. pp. 266-282.
- Exploring LLM Prompting Strategies for Joint Essay Scoring and Feedback Generation. Maja Stahl, Leon Biermann, Andreas Nehring, Henning Wachsmuth. pp. 283-298.
- Towards Fine-Grained Pedagogical Control over English Grammar Complexity in Educational Text Generation. Dominik Glandorf, Detmar Meurers. pp. 299-308.
- LLMs in Short Answer Scoring: Limitations and Promise of Zero-Shot and Few-Shot Approaches. Imran Chamieh, Torsten Zesch, Klaus Giebermann. pp. 309-315.
- Automated Essay Scoring Using Grammatical Variety and Errors with Multi-Task Learning and Item Response Theory. Kosuke Doi, Katsuhito Sudoh, Satoshi Nakamura. pp. 316-329.
- Error Tracing in Programming: A Path to Personalised Feedback. Martha Shaka, Diego Carraro, Kenneth N. Brown. pp. 330-342.
- Improving Readability Assessment with Ordinal Log-Loss. Ho Hung Lim, John Lee. pp. 343-350.
- Automated Sentence Generation for a Spaced Repetition Software. Benjamin Paddags, Daniel Hershcovich, Valkyrie Savage. pp. 351-364.
- Using Large Language Models to Assess Young Students' Writing Revisions. Tianwen Li, Zhexiong Liu, Lindsay Clare Matsumura, Elaine Wang, Diane J. Litman, Richard Correnti. pp. 365-380.
- Automatic Crossword Clues Extraction for Language Learning. Santiago Berruti, Arturo Collazo, Diego Sellanes, Aiala Rosá, Luis Chiruzzo. pp. 381-390.
- Anna Karenina Strikes Again: Pre-Trained LLM Embeddings May Favor High-Performing Learners. Abigail Gurin Schleifer, Beata Beigman Klebanov, Moriah Ariely, Giora Alexandron. pp. 391-402.
- Assessing Student Explanations with Large Language Models Using Fine-Tuning and Few-Shot Learning. Dan Carpenter, Wookhee Min, Seung Y. Lee, Gamze Ozogul, Xiaoying Zheng, James C. Lester. pp. 403-413.
- Harnessing GPT to Study Second Language Learner Essays: Can We Use Perplexity to Determine Linguistic Competence? Ricardo Muñoz Sánchez, Simon Dobnik, Elena Volodina. pp. 414-427.
- BERT-IRT: Accelerating Item Piloting with BERT Embeddings and Explainable IRT Models. Kevin P. Yancey, Andrew Runge, Geoffrey T. LaFlair, Phoebe Mulcaire. pp. 428-438.
- Transfer Learning of Argument Mining in Student Essays. Yuning Ding, Julian Lohmann, Nils-Jonathan Schaller, Thorben Jansen, Andrea Horbach. pp. 439-449.
- Building Robust Content Scoring Models for Student Explanations of Social Justice Science Issues. Allison Bradford, Kenneth Steimel, Brian Riordan, Marcia C. Linn. pp. 450-458.
- From Miscue to Evidence of Difficulty: Analysis of Automatically Detected Miscues in Oral Reading for Feedback Potential. Beata Beigman Klebanov, Michael Suhan, Tenaha O'Reilly, Zuowei Wang. pp. 459-469.
- Findings from the First Shared Task on Automated Prediction of Difficulty and Response Time for Multiple-Choice Questions. Victoria Yaneva, Kai North, Peter Baldwin, Le An Ha, Saed Rezayi, Yiyun Zhou, Sagnik Ray Choudhury, Polina Harik, Brian Clauser. pp. 470-482.
- Predicting Item Difficulty and Item Response Time with Scalar-mixed Transformer Encoder Models and Rational Network Regression Heads. Sebastian Gombert, Lukas Menzel, Daniele Di Mitri, Hendrik Drachsler. pp. 483-492.
- UnibucLLM: Harnessing LLMs for Automated Prediction of Item Difficulty and Response Time for Multiple-Choice Questions. Ana-Cristina Rogoz, Radu-Tudor Ionescu. pp. 493-502.
- The British Council submission to the BEA 2024 shared task. Mariano Felice, Zeynep Duran Karaoz. pp. 503-511.
- ITEC at BEA 2024 Shared Task: Predicting Difficulty and Response Time of Medical Exam Questions with Statistical, Machine Learning, and Language Models. Anaïs Tack, Siem Buseyne, Changsheng Chen, Robbe D'hondt, Michiel De Vrindt, Alireza Gharahighehi, Sameh Metwaly, Felipe Kenji Nakano, Ann-Sophie Noreillie. pp. 512-521.
- Item Difficulty and Response Time Prediction with Large Language Models: An Empirical Analysis of USMLE Items. Okan Bulut, Guher Gorgun, Bin Tan. pp. 522-527.
- Utilizing Machine Learning to Predict Question Difficulty and Response Time for Enhanced Test Construction. Rishikesh Fulari, Jonathan Rusert. pp. 528-533.
- Leveraging Physical and Semantic Features of Text Item for Difficulty and Response Time Prediction of USMLE Questions. Gummuluri Venkata Ravi Ram, Ashinee Kesanam, Anand Kumar M. pp. 534-541.
- UPN-ICC at BEA 2024 Shared Task: Leveraging LLMs for Multiple-Choice Questions Difficulty Prediction. George Dueñas, Sergio Jimenez, Geral Mateus Ferro. pp. 542-550.
- Using Machine Learning to Predict Item Difficulty and Response Time in Medical Tests. Mehrdad Yousefpoori-Naeim, Shayan Zargari, Zahra Hatami. pp. 551-560.
- Large Language Model-based Pipeline for Item Difficulty and Response Time Estimation for Educational Assessments. Hariram Veeramani, Surendrabikram Thapa, Natarajan Balaji Shankar, Abeer Alwan. pp. 561-566.
- UNED team at BEA 2024 Shared Task: Testing different Input Formats for predicting Item Difficulty and Response Time in Medical Exams. Álvaro Rodrigo, Sergio Moreno-Álvarez, Anselmo Peñas. pp. 567-570.
- The BEA 2024 Shared Task on the Multilingual Lexical Simplification Pipeline. Matthew Shardlow, Fernando Alva-Manchego, Riza Batista-Navarro, Stefan Bott, Saul Calderon Ramirez, Rémi Cardon, Thomas François, Akio Hayakawa, Andrea Horbach, Anna Hülsing, Yusuke Ide, Joseph Marvin Imperial, Adam Nohejl, Kai North, Laura Occhipinti, Nelson Perez-Rojas, Nishat Raihan, Tharindu Ranasinghe, Martin Solis Salazar, Sanja Stajner, Marcos Zampieri, Horacio Saggion. pp. 571-589.
- TMU-HIT at MLSP 2024: How Well Can GPT-4 Tackle Multilingual Lexical Simplification? Taisei Enomoto, Hwichan Kim, Tosho Hirasawa, Yoshinari Nagai, Ayako Sato, Kyotaro Nakajima, Mamoru Komachi. pp. 590-598.
- ANU at MLSP-2024: Prompt-based Lexical Simplification for English and Sinhala. Sandaru Seneviratne, Hanna Suominen. pp. 599-604.
- ISEP_Presidency_University at MLSP 2024 Shared Task: Using GPT-3.5 to Generate Substitutes for Lexical Simplification. Benjamin Dutilleul, Mathis Debaillon, Sandeep Mathias. pp. 605-609.
- Archaeology at MLSP 2024: Machine Translation for Lexical Complexity Prediction and Lexical Simplification. Petru Cristea, Sergiu Nisioi. pp. 610-617.
- RETUYT-INCO at MLSP 2024: Experiments on Language Simplification using Embeddings, Classifiers and Large Language Models. Ignacio Sastre, Leandro Alfonso, Facundo Fleitas, Federico Gil, Andrés Lucas, Tomás Spoturno, Santiago Góngora, Aiala Rosá, Luis Chiruzzo. pp. 618-626.
- GMU at MLSP 2024: Multilingual Lexical Simplification with Transformer Models. Dhiman Goswami, Kai North, Marcos Zampieri. pp. 627-634.
- ITEC at MLSP 2024: Transferring Predictions of Lexical Difficulty from Non-Native Readers. Anaïs Tack. pp. 635-639.