Qwen2.5-7B-Instruct Fine-tuned for QED (Question-Explanation-Data)

Model Description

This model is a fine-tuned version of Qwen/Qwen2.5-7B-Instruct specialized for the QED (Question-Explanation-Data) explanatory question answering task. The model generates comprehensive explanations by simultaneously producing answers, evidence sentences, and entity relationship mappings.

Task Overview

The QED task, introduced in "QED: A Framework and Dataset for Explanations in Question Answering", requires a model to produce three components for each question-passage pair:

  1. Answer: the most concise text span in the passage that answers the question
  2. Selected Sentence: the single sentence in the passage that provides evidence for the answer
  3. Referential Equalities: mappings between entity mentions in the question and their referents in the passage

This approach enables more interpretable and trustworthy question answering systems.
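
For illustration only (a made-up example, not taken from the QED dataset), an explanation for the question "who wrote the novel moby dick" could look like:

{
  "answer": "Herman Melville",
  "selected_sentence": "Moby-Dick is an 1851 novel by American writer Herman Melville.",
  "referential_equalities": [
    {
      "question_reference": "the novel moby dick",
      "sentence_reference": "Moby-Dick",
      "bridge": false
    }
  ]
}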

Fine-tuning Methodology

  • Base Model: Qwen/Qwen2.5-7B-Instruct
  • Technique: LoRA (Low-Rank Adaptation) with parameters r=16, α=32, dropout=0.05
  • Efficiency: 4-bit quantization with bfloat16 computation (a configuration sketch follows this list)
  • Prompting: Few-shot learning with two randomly selected demonstration examples
  • Architecture: Structured JSON output generation with precise span extraction
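
A minimal configuration sketch with transformers, bitsandbytes, and peft is shown below. Only r=16, α=32, dropout=0.05, 4-bit loading, and bfloat16 compute come from the list above; the target modules, quantization type, and other details are assumptions, not the exact training setup.

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 4-bit quantization with bfloat16 compute (nf4 / double quantization are assumptions)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
)

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-7B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# LoRA hyperparameters from the list above; target_modules is an assumed choice
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)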

Performance Results

Substantial improvements over the base Qwen2.5-7B-Instruct model:

| Metric | Base Model (Zero-shot) | Fine-tuned Model | Improvement |
|---|---|---|---|
| Exact Match Accuracy | 3.4% | 12.4% | +9.0% |
| Answer Accuracy | 84.1% | 89.9% | +5.8% |
| All Mention F1 | 12.0% | 30.0% | +18.0% |
| Question Mention F1 | 12.9% | 36.3% | +23.4% |
| Context Mention F1 | 11.3% | 23.8% | +12.5% |

Metrics were evaluated on the QED dev set with a 0.5 F1 span-overlap threshold; a simplified illustration of this matching criterion is sketched below.
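
As a rough illustration of that criterion (a simplified token-overlap F1, not the official QED evaluation script), a predicted mention can be counted as correct when its F1 against the gold span reaches the threshold:

from collections import Counter

def span_f1(pred: str, gold: str) -> float:
    """Token-level F1 between a predicted span and a gold span (simplified)."""
    pred_tokens, gold_tokens = pred.lower().split(), gold.lower().split()
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)

# A mention pair counts as a match at the 0.5 threshold
print(span_f1("the Eiffel Tower", "Eiffel Tower") >= 0.5)  # True (F1 = 0.8)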

Training Code & Methodology

This model was trained using a comprehensive QED fine-tuning framework available on GitHub:

🔗 QED Fine-Tuning Framework

Key Strengths

  • Answer Accuracy: Achieves nearly 90% accuracy in answer span identification
  • Entity Resolution: Strong performance in mapping question entities to passage references
  • Structured Output: Reliable JSON generation following the QED annotation schema
  • Generalization: Robust performance across different question types and domains

Model Usage

Input your question and passage following the QED instruction format:

# Required input structure (literal JSON braces in the schema below are escaped as {{ }} inside the f-string)
input_text = f"""
Title: {document_title}
Question: {question}
Passage: {context_passage}

You are an expert at extracting answers and structured explanations from text.
Your response MUST be **valid JSON only** (no extra commentary).

Task
====
Given:
• a **title** for the passage,
• a **question** about the passage, and
• the **context passage** itself,

produce an explanation object with three parts:

1. "answer" – the **shortest span** from the passage that fully answers the question.
2. "selected_sentence" – the **single sentence** in the passage that entails or implies the answer.
3. "referential_equalities" – a list of mappings between phrases in the question and phrases in the selected sentence
   that refer to the **same real-world entity/event**.

   • Each mapping has two keys:
       - "question_reference": the exact phrase from the question (**must be a contiguous substring from the question,
          not from the context or title**).
       - "sentence_reference": the exact phrase from the selected sentence (**must be a contiguous substring from the selected sentence,
          not from the question or title**), or "" (empty string if the entire sentence is the referent).

     ▸ Use **""** for "sentence_reference" when the entity/event is not named by any specific phrase in the sentence –
       i.e. the entire sentence acts as the referent (a *bridge* to the whole sentence).  
       This corresponds to the (start = end = -1) convention in the QED dataset.

Output format
=============
Return **only** JSON in this exact schema:

{{
  "answer": "<string from passage>",
  "selected_sentence": "<string from passage>",
  "referential_equalities": [
    {{
      "question_reference": "<string from question only>",
      "sentence_reference": "<string from selected_sentence only, or "">",
      "bridge": "<false if not a bridge; otherwise, a string explaining the bridge connection, e.g., 'in', 'for', 'of', 'at', 'on'>"
    }}
    ...
  ]
}}
"""

# Model outputs structured explanations in JSON format
{
  "answer": "extracted answer span",
  "selected_sentence": "supporting evidence sentence", 
  "referential_equalities": [
    {
      "question_reference": "entity from question",
      "sentence_reference": "entity from passage",
      "bridge": false
    }
  ]
}
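
A minimal inference sketch, assuming the LoRA adapter is applied on top of the base model with peft (the generation settings below are illustrative, not the exact evaluation configuration):

import json
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "Qwen/Qwen2.5-7B-Instruct"
adapter_id = "DenisRz/qwen2.5-7b-qed"

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16, device_map="auto")
model = PeftModel.from_pretrained(model, adapter_id)

# Wrap the QED instruction (input_text from above) in the chat template
messages = [{"role": "user", "content": input_text}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

output_ids = model.generate(**inputs, max_new_tokens=512, do_sample=False)
response = tokenizer.decode(output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)

explanation = json.loads(response)  # parsed QED-style explanation
print(explanation["answer"])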

Training Details

  • Dataset: QED training subset with careful example curation
  • Learning Rate: 5e-6 with warmup ratio of 0.2
  • Batch Size: Effective batch size of 16 through gradient accumulation
  • Optimizer: Paged AdamW 8-bit for memory efficiency
  • Evaluation: Multi-threshold validation (0.5-0.9 F1 overlap)
  • Epochs: 3 (see the hyperparameter sketch below)
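
A hedged sketch of these hyperparameters using transformers' TrainingArguments; the per-device batch size / gradient-accumulation split, output path, and logging settings are assumptions, and only the values named above come from the training run:

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="qwen2.5-7b-qed-lora",   # output path is an assumption
    num_train_epochs=3,
    learning_rate=5e-6,
    warmup_ratio=0.2,
    per_device_train_batch_size=2,      # 2 x 8 accumulation steps = effective batch size 16 (split assumed)
    gradient_accumulation_steps=8,
    optim="paged_adamw_8bit",           # paged AdamW 8-bit
    bf16=True,
    logging_steps=10,
)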

Applications

This model is particularly suitable for:

  • Educational question answering systems requiring explanations
  • Research applications needing interpretable QA
  • Systems where answer provenance and entity tracking are important
  • Building more transparent and accountable AI assistants

Citation

Please cite the original QED work when using this model:

@article{lamm2020qed,
  title={QED: A Framework and Dataset for Explanations in Question Answering},
  author={Lamm, Matthew and Palomaki, Jennimaria and Alberti, Chris and Andor, Daniel and Choi, Eunsol and Baldini Soares, Livio and Collins, Michael},
  journal={arXiv preprint arXiv:2009.06354},
  year={2020}
}