IntrEx: A Dataset for Modeling Engagement in Educational Conversations
Abstract
IntrEx, a large dataset annotated for interestingness in educational conversations, shows that fine-tuned LLMs can predict human judgments of interestingness better than larger proprietary models, highlighting the role of linguistic and cognitive factors in engagement.
Engagement and motivation are crucial for second-language acquisition, yet maintaining learner interest in educational conversations remains a challenge. While prior research has explored what makes educational texts interesting, little is known about the linguistic features that drive engagement in conversations. To address this gap, we introduce IntrEx, the first large dataset annotated for interestingness and expected interestingness in teacher-student interactions. Built upon the Teacher-Student Chatroom Corpus (TSCC), IntrEx extends prior work by incorporating sequence-level annotations, allowing the study of engagement beyond isolated turns and capturing how interest evolves over extended dialogues. We employ a rigorous annotation process with over 100 second-language learners, using a comparison-based rating approach inspired by reinforcement learning from human feedback (RLHF) to improve agreement. We investigate whether large language models (LLMs) can predict human interestingness judgments, and find that LLMs (7B/8B parameters) fine-tuned on interestingness ratings outperform larger proprietary models such as GPT-4o, demonstrating the potential of specialised datasets for modelling engagement in educational settings. Finally, we analyse how linguistic and cognitive factors, such as concreteness, comprehensibility (readability), and uptake, influence engagement in educational dialogues.
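To make the comparison-based annotation idea concrete, here is a minimal sketch of the Bradley-Terry pairwise objective commonly used in RLHF-style preference learning, where a model learns a scalar reward from "A is more interesting than B" judgments. This is an illustrative sketch, not the paper's implementation: the small base model, the toy dialogue pair, and the single training step are assumptions (the authors fine-tune 7B/8B LLMs on IntrEx ratings).

```python
import torch
import torch.nn.functional as F
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Small stand-in backbone for illustration; the paper fine-tunes 7B/8B LLMs.
model_name = "distilbert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
# num_labels=1 turns the classification head into a scalar reward head.
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=1)
model.train()

def score(texts):
    """Return a scalar 'interestingness' reward for each dialogue snippet."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    return model(**batch).logits.squeeze(-1)

# One hypothetical annotated comparison: turn A judged more interesting than B.
more_interesting = ["Teacher: If you could only eat one food forever, which would it be, and why?"]
less_interesting = ["Teacher: Please complete exercise 3 on page 12."]

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
optimizer.zero_grad()
r_pos, r_neg = score(more_interesting), score(less_interesting)
# Bradley-Terry loss: push the preferred turn's reward above the dispreferred one's.
loss = -F.logsigmoid(r_pos - r_neg).mean()
loss.backward()
optimizer.step()
```

Pairwise objectives like this pair naturally with comparison-based annotation: the model only needs the relative ordering of two turns, which is the same judgment the annotators were asked to make.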
Community
Our paper releases the first dataset annotated for interestingness and expected interestingness in educational conversations. We studied how linguistic and cognitive features (concreteness, comprehensibility, information uptake, etc.) serve as predictors of interestingness, and experimented with using the data as a training signal for reward models. #EMNLP2025
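As a hedged illustration of the predictor analysis mentioned above, the sketch below correlates a readability proxy with interestingness ratings using standard libraries. The Hub repo id ("your-org/IntrEx") and the column names ("turn_text", "interestingness") are hypothetical placeholders; consult the released dataset card for the actual identifiers and schema.

```python
from datasets import load_dataset
from scipy.stats import spearmanr
import textstat

# Hypothetical repo id and split; replace with the real ones from the dataset card.
ds = load_dataset("your-org/IntrEx", split="train")

# Flesch Reading Ease as a simple comprehensibility proxy (assumed column names).
readability = [textstat.flesch_reading_ease(t) for t in ds["turn_text"]]
ratings = ds["interestingness"]

rho, p = spearmanr(readability, ratings)
print(f"Spearman rho between readability and interestingness: {rho:.3f} (p={p:.3g})")
```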
This is an automated message from the Librarian Bot. The following papers, similar to this one, were recommended by the Semantic Scholar API:
- Neural Models and Language Model Prompting for the Multidimensional Evaluation of Open-Ended Conversations (2025)
- Cultivating Helpful, Personalized, and Creative AI Tutors: A Framework for Pedagogical Alignment using Reinforcement Learning (2025)
- Can LLMs Generate High-Quality Task-Specific Conversations? (2025)
- F2RVLM: Boosting Fine-grained Fragment Retrieval for Multi-Modal Long-form Dialogue with Vision Language Model (2025)
- DialogueForge: LLM Simulation of Human-Chatbot Dialogue (2025)
- CoDAE: Adapting Large Language Models for Education via Chain-of-Thought Data Augmentation (2025)
- User Feedback in Human-LLM Dialogues: A Lens to Understand Users But Noisy as a Learning Signal (2025)