---
license: apache-2.0
tags:
- code
- code-refactoring
- bug-detection
- code-translation
- static-analysis
- transformer
- developer-tools
language:
- code
pipeline_tag: other
model_type: transformer
library_name: transformers
datasets:
- custom
trained_on:
- multi-language code repositories
- refactor pairs
- bugfix pairs
- conversion pairs
---

# 🚀 Universal Code Refactor 32B

Universal Code Refactor 32B is a complete **AI-driven code engineering system** designed to automate large-scale refactoring, bug discovery, language-to-language conversion, and code optimization.

The project includes a full toolkit: **model**, **pipelines**, **refactor engine**, **bug detector**, **conversion engine**, **API**, **CLI**, **Gradio UI**, **datasets**, and **training scripts**.

# 🌟 Features

## 🔧 1. Multi-Language Code Refactoring

Supports intelligent transformations for multiple languages:

- **Python**
- **Java**
- **JavaScript**

Includes:

- Automatic formatting (Black + isort)
- Unused import removal
- Inline simple functions
- Java loop modernization → for-each syntax
- JavaScript `var → let` transformation
- Structural code cleanup
- Rule-based + AST-based hybrid refactoring

## 🐞 2. Static Bug Detection

Real AST-based detection, including:

- Possible None/null dereferences
- Unused variables
- Unsafe JavaScript `eval()` usage
- Missing null checks in Java
- Future support for type-based reasoning

## 🔄 3. Multi-Language Code Conversion

Built-in conversions:

- **Python → Java**
- **Java → Python**

Supports:

- Function extraction
- Main() generation
- Basic block translation
- Extendable conversion rules

## 📄 4. Patch & Diff Generation

Automated patch engine creates:

- Unified diffs
- Patch previews
- Patch cleanliness scores
- Complexity reduction metrics

Useful for PR automation and CI pipelines.

## 🧠 5. Compact Transformer Code Model

The model includes:

- Token embedding
- Positional encoding
- Transformer encoder stack
- Code-token-aware tokenizer
- Modular upgrade path to LLaMA / CodeGen / StarCoder models

## 🌐 6. Deployment Ecosystem

Included ready-to-run components:

### ✔ FastAPI REST Server

```
uvicorn inference.api_server:app --reload
```

### ✔ CLI Tool

```
python inference/cli.py --mode refactor --file example.py
```

### ✔ Gradio Web UI

```
python inference/gradio_app.py
```

### ✔ Docker Container

```
docker build -t universal-refactor .
docker run -p 8000:8000 universal-refactor
```

### ✔ Hugging Face Spaces App

Located inside `/deployment/huggingface_spaces/`

# 📂 Project Structure

```
Universal-Code-Refactor-32B/
│
├── README.md
├── requirements.txt
├── MODEL_CARD.md
│
├── src/universal_refactor/
│   ├── refactor_engine.py
│   ├── bug_detector.py
│   ├── code_converter.py
│   ├── patch_generator.py
│   ├── pipelines.py
│   ├── tokenizer.py
│   ├── model.py
│   ├── long_context_manager.py
│   ├── utils.py
│   └── embeddings/
│
├── inference/
│   ├── api_server.py
│   ├── cli.py
│   └── gradio_app.py
│
├── deployment/
│   ├── Dockerfile
│   └── huggingface_spaces/
│
├── training/
│   ├── pretrain.py
│   ├── finetune_refactor.py
│   ├── finetune_bugfix.py
│   ├── tokenizer_training.py
│   ├── long_context_training.py
│   └── distributed/
│
└── datasets/
    ├── code_repo_raw/
    ├── multilingual_code_clean/
    ├── refactor_pairs/
    ├── bugfix_pairs/
    ├── conversion_pairs/
    └── metadata.json
```

# 🛠 Installation

## 1. Clone Repository

```
git clone https://github.com/YOUR_USERNAME/universal-code-refactor-32b
cd universal-code-refactor-32b
```

## 2. Install Dependencies

```
pip install -r requirements.txt
```

# 🚀 Usage Examples

## 🔧 Refactor Python Code

```
python inference/cli.py --mode refactor --file sample.py --lang python
```

## 🔄 Convert Java → Python

```
python inference/cli.py --mode convert --file MyClass.java --src java --tgt python
```

## 🌐 Run Web UI

```
python inference/gradio_app.py
```

# 📊 Evaluation Tools

The evaluation pipeline computes:

- Cyclomatic complexity reduction
- Patch cleanliness
- Code change metrics
- Structural improvement score

Run evaluation:

```
python evaluation/evaluate.py
```
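
To give a feel for the first metric listed under Evaluation Tools, cyclomatic complexity reduction can be approximated with Python's standard `ast` module by counting decision points before and after a refactor. This is only a minimal sketch under stated assumptions, not the contents of `evaluation/evaluate.py`: the function names `cyclomatic_complexity` and `complexity_reduction` and the sample snippets are hypothetical, and the branch-counting rule is a common simplification of McCabe's metric.

```python
import ast

# Node types that each add one decision point (a simplified McCabe count).
_BRANCH_NODES = (ast.If, ast.For, ast.AsyncFor, ast.While,
                 ast.ExceptHandler, ast.IfExp)


def cyclomatic_complexity(source: str) -> int:
    """Approximate the cyclomatic complexity of a Python source string."""
    complexity = 1  # straight-line code has exactly one path
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, _BRANCH_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` introduces two short-circuit branches
            complexity += len(node.values) - 1
        elif isinstance(node, ast.comprehension):
            # one for the loop itself, plus one per `if` filter
            complexity += 1 + len(node.ifs)
    return complexity


def complexity_reduction(before: str, after: str) -> float:
    """Fraction by which complexity dropped (negative if it grew)."""
    old = cyclomatic_complexity(before)
    new = cyclomatic_complexity(after)
    return (old - new) / old  # old is always >= 1


# Hypothetical before/after pair, as a refactor step might produce.
BEFORE = """
def sign(x):
    if x > 0:
        return 1
    else:
        if x < 0:
            return -1
        else:
            return 0
"""

AFTER = """
def sign(x):
    return (x > 0) - (x < 0)
"""

print(cyclomatic_complexity(BEFORE), cyclomatic_complexity(AFTER))  # 3 1
print(f"{complexity_reduction(BEFORE, AFTER):.2f}")                 # 0.67
```

The count starts at 1 so the denominator in `complexity_reduction` can never be zero. A production metric would likely report per-function values rather than one number per file.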