Prompt Injection DeBERTa
finetuned DeBERTa-based prompt injection detection
Build secure, reliable, and long-term AI systems focused on safety, reasoning, and developer tooling.
AI Security · Autonomous Systems · LLM Safety
Independent research lab building open datasets, models, and frameworks for LLM security and autonomous evaluation.
AI In The Loop (AITL): A Systems Taxonomy for Closed-Loop Autonomous Evaluation Sanskar Jajoo · Neuralchemy Labs · 2026 zenodo.org/records/19551173
The Autonomous Sunk-Cost Fallacy: Stopping Failures and Meta-Reasoning in LLMs Deployed within AEOS Sanskar Jajoo · Neuralchemy Labs · 2026 zenodo.org/records/19846960
Prompt Injection Threat Matrix 32,320 samples · 7 intent classes · 10 severity levels · Full threat schema View Dataset
Prompt Injection Dataset 6,000+ samples · Benign vs malicious · Real-world attack scenarios View Dataset
DistilBERT Threat Matrix Classifier 99.4% F1 · Prompt intent classification · High-speed inference View Model
DeBERTa Prompt Injection Classifier Transformer-based injection detection View Model
Classical ML Detector Lightweight RF/LR classifiers for offline/legacy deployment View Model
Try our prompt injection classifier: Prompt-injection-DeBERTa Space
NeuralAlchemy is an independent AI security research lab based in India. We build open datasets, train security models, and publish research on LLM behavioral failures and autonomous evaluation systems.
Website: neuralchemy.in GitHub: github.com/m4vic Contact: Via GitHub or neuralchemy.in