Improve model card: Add pipeline tag, library, code link, and usage (#1)

- Improve model card: Add pipeline tag, library, code link, and usage (ad41797ee0b2adec9ba6bf8f606bc4fc4ef00d1b)

Co-authored-by: Niels Rogge <[email protected]>

Files changed (1)

README.md +25 -3

@@ -1,16 +1,20 @@
 ---
-license: mit
+base_model:
+- Qwen/Qwen2.5-Coder-7B-Instruct
 datasets:
 - luzimu/webgen-agent_train_step-grpo
 - luzimu/webgen-agent_train_sft
-base_model:
-- Qwen/Qwen2.5-Coder-7B-Instruct
+license: mit
+pipeline_tag: image-text-to-text
+library_name: transformers
 ---
 # WebGen-Agent
 WebGen-Agent is an advanced website generation agent designed to autonomously create websites from natural language instructions. It was introduced in the paper [WebGen-Agent: Enhancing Interactive Website Generation with Multi-Level Feedback and Step-Level Reinforcement Learning](https://arxiv.org/pdf/2509.22644v1).
+Code: https://github.com/mnluzimu/WebGen-Agent
 ## Project Overview
 WebGen-Agent combines state-of-the-art language models with specialized training techniques to create a powerful website generation tool. The agent can understand natural language instructions specifying appearance and functional requirements, iteratively generate website codebases, and refine them using visual and functional feedback.
@@ -55,6 +59,24 @@ These dual rewards provide dense, reliable process supervision that significantl
 ![Step-GRPO with Screenshot and GUI-agent Feedback](fig/step-grpo.png)
+## Sample Usage
+Before running inference, rename `.env.template` to `.env` and set the base URLs and API keys for the agent-engine LLM and the feedback VLM. These credentials can be obtained from any OpenAI-compatible provider, such as [OpenRouter](https://openrouter.ai/), [ModelScope](https://www.modelscope.cn/my/overview), [Bailian](https://bailian.console.aliyun.com/#/home), or [llmprovider](https://llmprovider.ai/).
+You can also deploy open-source VLMs and LLMs by running `src/scripts/deploy_qwenvl_32b.sh` and `src/scripts/deploy.sh`. Scripts for single inference and batch inference can be found at `src/scripts/infer_single.sh` and `src/scripts/infer_batch.sh`.
+```bash
+python src/infer_single.py \
+    --model deepseek-chat \
+    --vlm_model Qwen/Qwen2.5-VL-32B-Instruct \
+    --instruction "Please implement a wheel of fortune website." \
+    --workspace-dir workspaces_root/test \
+    --log-dir service_logs/test \
+    --max-iter 20 \
+    --overwrite \
+    --error-limit 5
+```
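As a rough sketch only (not part of this commit or of the WebGen-Agent repository), the sample command above could be wrapped in a small dry-run helper that prints the command for a named run, so the flags can be reviewed before any API calls are made. The flag names and values are copied from the sample. The helper name `build_infer_cmd` and the convention of sharing one run name between `--workspace-dir` and `--log-dir` are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Dry-run sketch: print the infer_single.py command for a named run.
# Flags and values are copied from the sample command in the model card.
# ASSUMPTIONS: the helper name build_infer_cmd and the shared run-name
# convention are hypothetical and do not come from the repository.

build_infer_cmd() {
    local run_name="$1"      # used for both the workspace and the log directory
    local instruction="$2"   # natural-language website instruction
    printf 'python src/infer_single.py'
    printf ' --model %s' 'deepseek-chat'
    printf ' --vlm_model %s' 'Qwen/Qwen2.5-VL-32B-Instruct'
    printf ' --instruction "%s"' "$instruction"
    printf ' --workspace-dir %s' "workspaces_root/${run_name}"
    printf ' --log-dir %s' "service_logs/${run_name}"
    printf ' --max-iter 20 --overwrite --error-limit 5\n'
}

# The model card asks for .env.template to be renamed to .env first,
# so warn on stderr if that step appears to have been skipped.
[ -f .env ] || echo "warning: .env not found; rename .env.template to .env first" >&2

# Print (do not execute) the command for the sample instruction.
build_infer_cmd test "Please implement a wheel of fortune website."
```

The helper only prints the command on a single line. Copying that line into a shell, from the repository root with `.env` configured, would run the same inference as the sample above.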
 ## Citation
 If you find our project useful, please cite: