CodeBERT fine-tuned for Java Vulnerability Detection
CodeBERT model fine-tuned for detecting security vulnerabilities in Java code.
Model Description
This model is fine-tuned from microsoft/codebert-base for binary classification of secure/insecure Java code.
Intended Uses
- Detect security vulnerabilities in Java source code
- Binary classification: Safe (LABEL_0) vs Vulnerable (LABEL_1)
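
The label convention above can be applied to raw classifier outputs roughly as follows. This is a minimal sketch, assuming the model returns two logits ordered as LABEL_0, LABEL_1; the names `ID2LABEL` and `interpret_logits` are illustrative and not part of the model's API.

```python
import math

# Assumed mapping, taken from the label convention listed above
ID2LABEL = {0: "Safe", 1: "Vulnerable"}


def interpret_logits(logits):
    """Convert a pair of raw logits into (label, probability)."""
    # Numerically stable softmax: subtract the max before exponentiating
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Pick the class with the highest probability
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]


# Example with made-up logits; real values come from the model's output
label, prob = interpret_logits([-1.2, 2.3])
print(label, round(prob, 3))
```

Reporting the probability alongside the label lets a caller treat low-confidence predictions differently from high-confidence ones.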
How to Use
```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the fine-tuned tokenizer and classifier
tokenizer = AutoTokenizer.from_pretrained("mangsense/codebert_java")
model = AutoModelForSequenceClassification.from_pretrained("mangsense/codebert_java")
model.eval()

# Tokenize the Java snippet; input beyond the model's maximum length is truncated
inputs = tokenizer("your code here", return_tensors="pt", truncation=True, padding="max_length")

# Run inference; labels are only needed when computing a loss
with torch.no_grad():
    logits = model(**inputs).logits

# 0 = Safe (LABEL_0), 1 = Vulnerable (LABEL_1)
print(logits.argmax(dim=-1).item())
```
Training Data
Trained on the CodeXGLUE Defect Detection dataset.
Limitations
- Focused on Java code only
- May not detect all types of vulnerabilities