Improve model card: Add pipeline tag, library, code link, and usage

This PR enhances the model card by:
- Adding `pipeline_tag: image-text-to-text` to the metadata for better discoverability.
- Specifying `library_name: transformers` in the metadata, as the model's configuration (`model_type: qwen2`, `tokenizer_class: Qwen2Tokenizer`) indicates compatibility with the Hugging Face Transformers library. This will enable automated code snippets on the Hub.
- Adding a direct link to the GitHub repository in the main content.
- Including a "Sample Usage" section with a "Single Inference" code snippet, taken directly from the GitHub README, to help users get started.

All existing content and metadata, including the current arXiv paper link, have been preserved.
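
For reviewers who want to try the new "Sample Usage" section, the `.env` step it describes can be sketched as a small shell helper. This is only an illustration: the function name `setup_env` is hypothetical and not part of the repository, and it copies the template instead of renaming it (a deliberate deviation from the README wording) so that the original `.env.template` stays available. The variable names inside `.env` are defined by the repository's template and are not reproduced here.

```shell
# Hypothetical helper (not part of WebGen-Agent): create .env from .env.template.
# Assumes the argument is the root of a WebGen-Agent checkout (defaults to ".").
setup_env() {
    repo_dir="${1:-.}"
    if [ -f "$repo_dir/.env" ]; then
        # Never overwrite an existing .env, since it may already hold API keys.
        echo ".env already exists in $repo_dir; leaving it untouched"
    elif [ -f "$repo_dir/.env.template" ]; then
        # Copy rather than rename, so the template is kept for reference.
        cp "$repo_dir/.env.template" "$repo_dir/.env"
        echo "Created $repo_dir/.env; now fill in the base URLs and API keys"
    else
        echo "No .env.template in $repo_dir; is this the repository root?" >&2
        return 1
    fi
}
```

Calling `setup_env .` from the repository root and then editing `.env` by hand should leave the checkout ready for the `src/infer_single.py` command shown in the diff.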

Files changed (1)

README.md +25 -3

@@ -1,16 +1,20 @@
 ---
-license: mit
+base_model:
+- Qwen/Qwen2.5-Coder-7B-Instruct
 datasets:
 - luzimu/webgen-agent_train_step-grpo
 - luzimu/webgen-agent_train_sft
-base_model:
-- Qwen/Qwen2.5-Coder-7B-Instruct
+license: mit
+pipeline_tag: image-text-to-text
+library_name: transformers
 ---
 # WebGen-Agent
 WebGen-Agent is an advanced website generation agent designed to autonomously create websites from natural language instructions. It was introduced in the paper [WebGen-Agent: Enhancing Interactive Website Generation with Multi-Level Feedback and Step-Level Reinforcement Learning](https://arxiv.org/pdf/2509.22644v1).
+Code: https://github.com/mnluzimu/WebGen-Agent
 ## Project Overview
 WebGen-Agent combines state-of-the-art language models with specialized training techniques to create a powerful website generation tool. The agent can understand natural language instructions specifying appearance and functional requirements, iteratively generate website codebases, and refine them using visual and functional feedback.
@@ -55,6 +59,24 @@ These dual rewards provide dense, reliable process supervision that significantl
 ![Step-GRPO with Screenshot and GUI-agent Feedback](fig/step-grpo.png)
+## Sample Usage
+Before running inference, you should rename `.env.template` to `.env` and set the base urls and api keys for the agent-engine LLM and feedback VLM. They can be obtained from any openai-compatible providers such as [openrouter](https://openrouter.ai/), [modelscope](https://www.modelscope.cn/my/overview), [bailian](https://bailian.console.aliyun.com/#/home), and [llmprovider](https://llmprovider.ai/).
+You can also deploy open-source VLMs and LLMs by running `src/scripts/deploy_qwenvl_32b.sh` and `src/scripts/deploy.sh`. Scripts for single inference and batch inference can be found at `src/scripts/infer_single.sh` and `src/scripts/infer_batch.sh`.
+```bash
+python src/infer_single.py \
+    --model deepseek-chat \
+    --vlm_model Qwen/Qwen2.5-VL-32B-Instruct \
+    --instruction "Please implement a wheel of fortune website." \
+    --workspace-dir workspaces_root/test \
+    --log-dir service_logs/test \
+    --max-iter 20 \
+    --overwrite \
+    --error-limit 5
+```
 ## Citation
 If you find our project useful, please cite: