arxiv:2108.04631
Megadiff: A Dataset of 600k Java Source Code Changes Categorized by Diff Size
Published on Aug 10, 2021
Abstract
This paper presents Megadiff, a dataset of source code diffs. It focuses on Java, with strict inclusion criteria based on commit message and diff size. Megadiff contains 663 029 Java diffs that can be used for research on commit comprehension, fault localization, automated program repair, and machine learning on code changes.