luzimu nielsr HF Staff committed on
Commit 3874ec8 · verified · 1 Parent(s): 58b01a9

Improve model card: Add pipeline tag, library, code link, and usage (#1)

- Improve model card: Add pipeline tag, library, code link, and usage (ad41797ee0b2adec9ba6bf8f606bc4fc4ef00d1b)


Co-authored-by: Niels Rogge <[email protected]>

Files changed (1)
  1. README.md +25 -3
README.md CHANGED
@@ -1,16 +1,20 @@
1
  ---
2
- license: mit
 
3
  datasets:
4
  - luzimu/webgen-agent_train_step-grpo
5
  - luzimu/webgen-agent_train_sft
6
- base_model:
7
- - Qwen/Qwen2.5-Coder-7B-Instruct
 
8
  ---
9
 
10
  # WebGen-Agent
11
 
12
  WebGen-Agent is an advanced website generation agent designed to autonomously create websites from natural language instructions. It was introduced in the paper [WebGen-Agent: Enhancing Interactive Website Generation with Multi-Level Feedback and Step-Level Reinforcement Learning](https://arxiv.org/pdf/2509.22644v1).
13
 
 
 
14
  ## Project Overview
15
 
16
  WebGen-Agent combines state-of-the-art language models with specialized training techniques to create a powerful website generation tool. The agent can understand natural language instructions specifying appearance and functional requirements, iteratively generate website codebases, and refine them using visual and functional feedback.
@@ -55,6 +59,24 @@ These dual rewards provide dense, reliable process supervision that significantl
55
 
56
  ![Step-GRPO with Screenshot and GUI-agent Feedback](fig/step-grpo.png)
57
 
58
  ## Citation
59
 
60
  If you find our project useful, please cite:
 
1
  ---
2
+ base_model:
3
+ - Qwen/Qwen2.5-Coder-7B-Instruct
4
  datasets:
5
  - luzimu/webgen-agent_train_step-grpo
6
  - luzimu/webgen-agent_train_sft
7
+ license: mit
8
+ pipeline_tag: image-text-to-text
9
+ library_name: transformers
10
  ---
11
 
12
  # WebGen-Agent
13
 
14
  WebGen-Agent is an advanced website generation agent designed to autonomously create websites from natural language instructions. It was introduced in the paper [WebGen-Agent: Enhancing Interactive Website Generation with Multi-Level Feedback and Step-Level Reinforcement Learning](https://arxiv.org/pdf/2509.22644v1).
15
 
16
+ Code: https://github.com/mnluzimu/WebGen-Agent
17
+
18
  ## Project Overview
19
 
20
  WebGen-Agent combines state-of-the-art language models with specialized training techniques to create a powerful website generation tool. The agent can understand natural language instructions specifying appearance and functional requirements, iteratively generate website codebases, and refine them using visual and functional feedback.
 
59
 
60
  ![Step-GRPO with Screenshot and GUI-agent Feedback](fig/step-grpo.png)
61
 
62
+ ## Sample Usage
63
+
64
+ Before running inference, rename `.env.template` to `.env` and set the base URLs and API keys for the agent-engine LLM and the feedback VLM. Both can be obtained from any OpenAI-compatible provider, such as [OpenRouter](https://openrouter.ai/), [ModelScope](https://www.modelscope.cn/my/overview), [Bailian](https://bailian.console.aliyun.com/#/home), or [llmprovider](https://llmprovider.ai/).
65
+
66
+ Alternatively, you can deploy open-source VLMs and LLMs yourself by running `src/scripts/deploy_qwenvl_32b.sh` and `src/scripts/deploy.sh`, respectively. Scripts for single and batch inference are provided at `src/scripts/infer_single.sh` and `src/scripts/infer_batch.sh`.
67
+
68
+ ```bash
69
+ python src/infer_single.py \
70
+ --model deepseek-chat \
71
+ --vlm_model Qwen/Qwen2.5-VL-32B-Instruct \
72
+ --instruction "Please implement a wheel of fortune website." \
73
+ --workspace-dir workspaces_root/test \
74
+ --log-dir service_logs/test \
75
+ --max-iter 20 \
76
+ --overwrite \
77
+ --error-limit 5
78
+ ```
79
+
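The `.env` setup step described in the Sample Usage section can be guarded with a small pre-flight check. The sketch below is not part of the repository: `check_env` is a hypothetical helper, and it assumes the template uses plain `KEY=value` lines (the actual variable names in `.env.template` are not shown here, so none are hard-coded). It copies the template to `.env` if needed and reports any keys that are still empty.

```bash
#!/usr/bin/env bash
# Hypothetical pre-flight helper (not from the WebGen-Agent repo).
# Assumption: .env.template consists of plain KEY=value lines.

check_env() {
  local env_file="${1:-.env}" template="${2:-.env.template}"

  # Copy instead of renaming, so the template stays available for reference.
  if [ ! -f "$env_file" ]; then
    cp "$template" "$env_file" || return 2
  fi

  # Collect keys whose value is empty (KEY= or KEY=""); comments and
  # blank lines do not match the pattern and are ignored.
  local missing
  missing=$(grep -E '^[A-Za-z_][A-Za-z0-9_]*=("")?[[:space:]]*$' "$env_file" | cut -d= -f1)

  if [ -n "$missing" ]; then
    echo "Unset keys in $env_file:" >&2
    echo "$missing" >&2
    return 1
  fi
  return 0
}
```

After sourcing the file that defines it, one way to use this is `check_env && bash src/scripts/infer_single.sh`, so that inference only starts once every key in `.env` has a value.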
80
  ## Citation
81
 
82
  If you find our project useful, please cite: