---
license: apache-2.0
tags:
  - code
  - code-refactoring
  - bug-detection
  - code-translation
  - static-analysis
  - transformer
  - developer-tools
language:
  - code
pipeline_tag: other
model_type: transformer
library_name: transformers
datasets:
  - custom
trained_on:
  - multi-language code repositories
  - refactor pairs
  - bugfix pairs
  - conversion pairs
---

# 🚀 Universal Code Refactor 32B  

Universal Code Refactor 32B is an **AI-driven code engineering toolkit** that automates large-scale refactoring, bug detection, language-to-language conversion, and code optimization.  
The project ships the full stack: **model**, **pipelines**, **refactor engine**, **bug detector**, **conversion engine**, **API**, **CLI**, **Gradio UI**, **datasets**, and **training scripts**.


# 🌟 Features

## 🔧 1. Multi-Language Code Refactoring  
Supports automated transformations for three languages:

- **Python**
- **Java**
- **JavaScript**

Includes:
- Automatic Python formatting (Black + isort)
- Unused import removal
- Inlining of simple functions
- Java loop modernization to for-each syntax
- JavaScript `var` → `let` transformation
- Structural code cleanup  
- Hybrid rule-based and AST-based refactoring
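As an illustration of the AST-based side of the list above, here is a minimal sketch of unused import removal for Python using only the standard `ast` module. The function name `remove_unused_imports` is hypothetical and is not taken from `refactor_engine.py`; the sketch also assumes Python 3.9+ for `ast.unparse`.

```python
import ast


def remove_unused_imports(source: str) -> str:
    """Drop top-level imports whose bound names are never referenced.

    Hypothetical helper for illustration; not the project's actual API.
    """
    tree = ast.parse(source)

    # Every identifier that appears anywhere in the module.
    # `os.path.join` starts from a Name node `os`, so dotted use is covered.
    used = {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}

    def is_needed(stmt, alias):
        if alias.name == "*":
            return True  # star imports cannot be analysed safely, keep them
        if isinstance(stmt, ast.ImportFrom) and stmt.module == "__future__":
            return True  # __future__ imports change semantics, keep them
        # `import a.b` binds `a`; `import a.b as c` binds `c`.
        bound = alias.asname or alias.name.split(".")[0]
        return bound in used

    new_body = []
    for stmt in tree.body:
        if isinstance(stmt, (ast.Import, ast.ImportFrom)):
            stmt.names = [a for a in stmt.names if is_needed(stmt, a)]
            if not stmt.names:
                continue  # every alias was unused, drop the statement
        new_body.append(stmt)

    tree.body = new_body
    return ast.unparse(tree)
```

Note that `ast.unparse` discards comments and original formatting, and names exported only through `__all__` strings are not detected, so a production pass would need a source-preserving rewriter.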


## 🐞 2. Static Bug Detection  
AST-based static checks, including:

- Possible `None`/null dereferences  
- Unused variables  
- Unsafe JavaScript `eval()` usage  
- Missing null checks in Java  
- Type-based reasoning (planned)
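To make one of the checks above concrete, the following sketch flags local variables that are assigned but never read inside a Python function. It uses the standard `ast` module; the name `find_unused_variables` is an assumption for illustration and does not reflect the real interface of `bug_detector.py`.

```python
import ast


def find_unused_variables(source: str) -> list[tuple[str, int]]:
    """Return sorted (name, line) pairs for names assigned but never read.

    Hypothetical helper for illustration; not the project's actual API.
    """
    findings = set()
    for func in ast.walk(ast.parse(source)):
        if not isinstance(func, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue

        assigned = {}  # name -> earliest line where it is stored
        loaded = set()  # names that are read somewhere in the function
        for node in ast.walk(func):
            if not isinstance(node, ast.Name):
                continue
            if isinstance(node.ctx, ast.Store):
                first = assigned.get(node.id, node.lineno)
                assigned[node.id] = min(first, node.lineno)
            elif isinstance(node.ctx, ast.Load):
                loaded.add(node.id)

        for name, line in assigned.items():
            # A leading underscore conventionally marks an intentional discard.
            if name not in loaded and not name.startswith("_"):
                findings.add((name, line))

    return sorted(findings, key=lambda item: (item[1], item[0]))
```

For the input `def f(a): x = 1; y = a + 1; return y` (written on separate lines), the sketch reports `x` only. It ignores `global`/`nonlocal` declarations and closures, which a real detector would have to model.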


## 🔄 3. Multi-Language Code Conversion  
Built-in conversions:

- **Python → Java**
- **Java → Python**

Supports:
- Function extraction  
- `main()` generation  
- Basic block translation  
- Extensible conversion rules  
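The first two items above (function extraction and `main()` generation) can be sketched as follows for the Python → Java direction. This is only an assumed outline: the function name `python_to_java_skeleton`, the default class name `Converted`, and the use of `Object` for every type are placeholders, and bodies are left untranslated.

```python
import ast


def python_to_java_skeleton(source: str, class_name: str = "Converted") -> str:
    """Emit a Java class with one stub method per top-level Python function.

    Hypothetical helper for illustration; types default to Object because
    this sketch performs no type inference.
    """
    lines = [f"public class {class_name} {{"]

    for node in ast.parse(source).body:
        if isinstance(node, ast.FunctionDef):
            # Function extraction: carry over the name and positional parameters.
            params = ", ".join(f"Object {arg.arg}" for arg in node.args.args)
            lines.append(f"    static Object {node.name}({params}) {{")
            lines.append("        // TODO: translate body")
            lines.append("        return null;")
            lines.append("    }")
            lines.append("")

    # main() generation: an empty entry point the caller can fill in.
    lines.append("    public static void main(String[] args) {")
    lines.append("    }")
    lines.append("}")
    return "\n".join(lines)
```

Calling `python_to_java_skeleton("def add(a, b):\n    return a + b\n")` yields a class containing `static Object add(Object a, Object b)` plus an empty `main`.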


## 📄 4. Patch & Diff Generation  
Automated patch engine creates:

- Unified diffs  
- Patch previews  
- Patch cleanliness scores  
- Complexity reduction metrics  

Useful for PR automation and CI pipelines.
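A unified diff of the kind listed above can be produced with Python's standard `difflib` module. The sketch below is an assumption about how such a step might look, not the contents of `patch_generator.py`; the helper names and the `a/` and `b/` path prefixes are illustrative.

```python
import difflib


def make_unified_diff(before: str, after: str, path: str = "example.py") -> str:
    """Return a unified diff between two versions of one file."""
    diff = difflib.unified_diff(
        before.splitlines(keepends=True),
        after.splitlines(keepends=True),
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
    )
    return "".join(diff)


def changed_line_counts(diff_text: str) -> tuple[int, int]:
    """Count (added, removed) lines, skipping the two file header lines."""
    added = removed = 0
    for line in diff_text.splitlines()[2:]:
        if line.startswith("+"):
            added += 1
        elif line.startswith("-"):
            removed += 1
    return added, removed
```

The added and removed counts are one simple input a code change metric could build on; how the project defines its patch cleanliness score is not specified here.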


## 🧠 5. Compact Transformer Code Model  
The model includes:

- Token embedding  
- Positional encoding  
- Transformer encoder stack  
- Code-token-aware tokenizer  
- Modular upgrade path to LLaMA / CodeGen / StarCoder models  
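The list above does not say which positional encoding the model uses. Assuming the fixed sinusoidal scheme from the original Transformer architecture, the table of position vectors can be computed as in this dependency-free sketch; the function name `sinusoidal_positions` is hypothetical.

```python
import math


def sinusoidal_positions(seq_len: int, d_model: int) -> list[list[float]]:
    """Build a seq_len x d_model table of sinusoidal position encodings.

    Assumption: the model uses fixed sinusoidal encodings. Even dimensions
    hold sin(pos / 10000^(2i/d_model)), odd dimensions hold the matching cos.
    """
    table = []
    for pos in range(seq_len):
        row = []
        for dim in range(d_model):
            # Each sin/cos pair shares one frequency, indexed by dim // 2.
            angle = pos / (10000 ** ((2 * (dim // 2)) / d_model))
            row.append(math.sin(angle) if dim % 2 == 0 else math.cos(angle))
        table.append(row)
    return table
```

In a full model each row would be added to the token embedding at the same position before the sequence enters the encoder stack.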


## 🌐 6. Deployment Ecosystem  
Included ready-to-run components:

### ✔ FastAPI REST Server  
```
uvicorn inference.api_server:app --reload
```

### ✔ CLI Tool  
```
python inference/cli.py --mode refactor --file example.py
```

### ✔ Gradio Web UI  
```
python inference/gradio_app.py
```

### ✔ Docker Container  
```
docker build -f deployment/Dockerfile -t universal-refactor .
docker run -p 8000:8000 universal-refactor
```

### ✔ Hugging Face Spaces App  
Located in `deployment/huggingface_spaces/`.


# 📂 Project Structure

```
Universal-Code-Refactor-32B/
│
├── README.md
├── requirements.txt
├── MODEL_CARD.md
│
├── src/universal_refactor/
│   ├── refactor_engine.py
│   ├── bug_detector.py
│   ├── code_converter.py
│   ├── patch_generator.py
│   ├── pipelines.py
│   ├── tokenizer.py
│   ├── model.py
│   ├── long_context_manager.py
│   ├── utils.py
│   └── embeddings/
│
├── inference/
│   ├── api_server.py
│   ├── cli.py
│   └── gradio_app.py
│
├── deployment/
│   ├── Dockerfile
│   └── huggingface_spaces/
│
├── training/
│   ├── pretrain.py
│   ├── finetune_refactor.py
│   ├── finetune_bugfix.py
│   ├── tokenizer_training.py
│   ├── long_context_training.py
│   └── distributed/
│
├── evaluation/
│   └── evaluate.py
│
└── datasets/
    ├── code_repo_raw/
    ├── multilingual_code_clean/
    ├── refactor_pairs/
    ├── bugfix_pairs/
    ├── conversion_pairs/
    └── metadata.json
```


# 🛠 Installation

## 1. Clone Repository  
```
git clone https://github.com/YOUR_USERNAME/universal-code-refactor-32b
cd universal-code-refactor-32b
```

## 2. Install Dependencies  
```
pip install -r requirements.txt
```


# 🚀 Usage Examples

## 🔧 Refactor Python Code  
```
python inference/cli.py --mode refactor --file sample.py --lang python
```

## 🔄 Convert Java → Python  
```
python inference/cli.py --mode convert --file MyClass.java --src java --tgt python
```

## 🌐 Run Web UI  
```
python inference/gradio_app.py
```


# 📊 Evaluation Tools

The evaluation pipeline computes:

- Cyclomatic complexity reduction  
- Patch cleanliness  
- Code change metrics  
- Structural improvement score  

Run evaluation:
```
python evaluation/evaluate.py
```
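For reference, cyclomatic complexity reduction can be approximated by counting decision points in the AST before and after a refactor. The sketch below is an assumed, simplified metric for Python sources only and is not the code in `evaluation/evaluate.py`; both function names are hypothetical.

```python
import ast

# Node types that each add one independent path through the code.
_DECISION_NODES = (
    ast.If,
    ast.For,
    ast.AsyncFor,
    ast.While,
    ast.ExceptHandler,
    ast.IfExp,
    ast.comprehension,
)


def cyclomatic_complexity(source: str) -> int:
    """Approximate cyclomatic complexity: 1 + number of decision points."""
    complexity = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, _DECISION_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` adds two short-circuit branches.
            complexity += len(node.values) - 1
    return complexity


def complexity_reduction(before: str, after: str) -> int:
    """Positive values mean the refactored code has fewer decision points."""
    return cyclomatic_complexity(before) - cyclomatic_complexity(after)
```

A function with two nested `if` statements scores 3; rewriting it as a single conditional expression scores 2, giving a reduction of 1.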