---
pipeline_tag: other
language: en
library_name: pytorch
license: apache-2.0
tags:
- music
- midi
- mir
- deduplication
- caugbert
model-index:
- name: LMD Deduplication - CAugBERT
  results:
  - task:
      type: representation-learning
      name: symbolic music representation learning
    dataset:
      type: midi
      name: Lakh MIDI Dataset
    metrics:
    - type: F1
      value: 0.493
---

# LMD Deduplication Supplements

This repository provides the pre-trained CAugBERT model checkpoint used in:

**"On the De-duplication of the Lakh MIDI Dataset" (ISMIR 2025)**

[[Paper]](https://ismir2025program.ismir.net/poster_188.html) | [[GitHub Code]](https://github.com/jech2/LMD_Deduplication)

---

# Usage

You can either integrate this checkpoint into the main repository for inference, or load it directly:

```bash
# Option 1: Run inference in the main repo
poetry run python inference.py  # make sure yamls/inference.yaml paths are correct
```

```python
# Option 2: Load checkpoint manually
import torch
from contrastive_musicbert.model.BERT import BERT_Lightning

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
checkpoint_path = "path/to/checkpoint.ckpt"  # placeholder: point this at the downloaded checkpoint

model = BERT_Lightning(...).to(device)  # see .hydra/config.yaml for arguments
checkpoint = torch.load(checkpoint_path, map_location="cpu")
model.load_state_dict(checkpoint["state_dict"])
model.eval()  # switch off dropout for inference
```
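
Once embeddings have been extracted with the loaded model, one common way to flag candidate duplicates is to compare them pairwise with cosine similarity. The following is a minimal sketch under that assumption, not the procedure from the paper or the main repository: the function names, the toy vectors, and the `0.9` threshold are all hypothetical, and the embedding extraction step itself is left to `inference.py`.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat zero vectors as dissimilar to everything
    return dot / (norm_a * norm_b)


def find_duplicate_pairs(embeddings, threshold=0.9):
    """Return (name_a, name_b, similarity) for pairs at or above the threshold.

    `embeddings` maps a file name to its embedding vector. The threshold is
    a hypothetical value and would need tuning on labeled data.
    """
    pairs = []
    for (name_a, vec_a), (name_b, vec_b) in combinations(embeddings.items(), 2):
        sim = cosine_similarity(vec_a, vec_b)
        if sim >= threshold:
            pairs.append((name_a, name_b, sim))
    return pairs


# Toy vectors standing in for model outputs (not real CAugBERT embeddings).
toy_embeddings = {
    "a.mid": [1.0, 0.0, 0.0],
    "b.mid": [0.9, 0.1, 0.0],
    "c.mid": [0.0, 1.0, 0.0],
}
duplicates = find_duplicate_pairs(toy_embeddings, threshold=0.9)
for name_a, name_b, sim in duplicates:
    print(f"{name_a} ~ {name_b}: {sim:.3f}")
```

The exhaustive pairwise loop is quadratic in the number of files, so it only suits small collections; at the scale of the full Lakh MIDI Dataset an approximate nearest-neighbour index would likely be needed instead.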

# Note

If you have any questions regarding the checkpoint, please contact:
Eunjin Choi ([email protected])