inventwithdean committed on
Commit
98fac05
·
1 Parent(s): 5f2ef40

add cost breakdown

Files changed (5)
  1. README.md +41 -3
  2. image.png → image-1.png +2 -2
  3. image-2.png +3 -0
  4. image-3.png +3 -0
  5. image-4.png +3 -0
README.md CHANGED
@@ -107,11 +107,26 @@ If you want to bring in your Claude to the Show (or any other client that only s
107
 
108
  The costs are constant because there can be only one guest at the show at one time while hundreds or even thousands of people can enjoy the show on YouTube.
109
 
110
 
111
  ## The Host - DeepSeek v3.2 Exp
112
- ![open router screenshot of deepseek v3.2 exp's costs](image.png)
113
 
114
- We chose an *open source* model that excels in *Role Playing* and is very *cost efficient* because of it's *sparse attention* architecture. The latest v3.2 experimental release from DeepSeek was exactly what we were looking for.
115
  | Model | cost per million input tokens | cost per million output tokens |
116
  | :--- | :--- | :--- |
117
  | DeepSeek v3.2 Exp | $0.216 | $0.328 | *The Emergent Show Host*
@@ -127,9 +142,32 @@ We previously decided to go with Pixel Streaming that Unreal provides, but that
127
  Because we didn't have viewers interacting with the game directly, we switched to YouTube Streaming (that can handle potentially hundreds of thousands of people watching the stream live while our costs are constant).
128
 
129
 
130
  ## 💡 Why This Matters
131
 
132
- This project demonstrates that **MCP is not just for file editing or database queries**—it can be the bridge between **Virtual Worlds** and **Large Language Models**. By standardizing the interface, we turn a video game into a universal destination for AI agents.
133
 
134
  ---
135
 
 
107
 
108
  The costs are constant because there can be only one guest at the show at one time while hundreds or even thousands of people can enjoy the show on YouTube.
109
 
110
+ ## Real-World Telemetry (Actual Spend)
111
+ While the table above is a conservative estimate that assumes the show is occupied 24x7, our actual observed costs over the roughly two-week period (100+ guest sessions) have been significantly lower, thanks to the efficiency of the show's architecture and the cost-efficient DeepSeek v3.2 Exp.
112
+
113
+ Below is the OpenRouter data for this project, covering every guest session that has happened so far.
114
+
115
+ #### Number of requests (Over 1500 requests to both the Host and TV Crew/Audience)
116
+ ![Number of requests](image-1.png)
117
+
118
+ #### Tokens processed (Because each guest session is independent, and there can only be one guest at a time)
119
+ ![Num tokens processed](image-2.png)
120
+
121
+ #### Spend (Just ~$0.31 total spend in 15 days)
122
+ ![Spend](image-3.png)
123
+
124
+ ##### We deploy Qwen3-Guard ourselves with vLLM, but because it's just a 4B model, costs are negligible.
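
To put the OpenRouter figures above in per-unit terms, here is a minimal sketch of the averages they imply. It assumes the rounded numbers quoted in the text (about $0.31, 1500 requests, 100 sessions, 15 days); because the request and session counts are lower bounds ("over 1500", "100+"), the per-request and per-session results are upper bounds. The function and key names are illustrative only.

```python
# Rounded figures quoted in the README text (not exact billing data).
TOTAL_SPEND_USD = 0.31   # "~$0.31 total spend"
TOTAL_REQUESTS = 1500    # "Over 1500 requests", treated as exactly 1500
GUEST_SESSIONS = 100     # "100+ guest sessions", treated as exactly 100
DAYS = 15                # "in 15 days"


def averages(spend, requests, sessions, days):
    """Return simple per-unit averages derived from the totals."""
    return {
        "usd_per_request": spend / requests,
        "usd_per_session": spend / sessions,
        "usd_per_day": spend / days,
        "usd_per_30_days": spend / days * 30,
    }


stats = averages(TOTAL_SPEND_USD, TOTAL_REQUESTS, GUEST_SESSIONS, DAYS)
for name, value in stats.items():
    print(f"{name}: ${value:.5f}")
```

At this observed rate, a 30-day month would come to well under one dollar of LLM spend.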
125
 
126
  ## The Host - DeepSeek v3.2 Exp
127
+ ![Open Router screenshot of DeepSeek v3.2 exp](image-4.png)
128
 
129
+ We chose an *open source* model that excels in *Role Playing* and is very *cost efficient* because of its *sparse attention* architecture. The latest v3.2 experimental release from DeepSeek was exactly what we were looking for.
130
  | Model | cost per million input tokens | cost per million output tokens |
131
  | :--- | :--- | :--- |
132
  | DeepSeek v3.2 Exp | $0.216 | $0.328 | *The Emergent Show Host*
 
142
  Because we didn't have viewers interacting with the game directly, we switched to YouTube Streaming (that can handle potentially hundreds of thousands of people watching the stream live while our costs are constant).
143
 
144
 
145
+ ## Why Local PiperTTS, not Cloud TTS?
146
+ We evaluated cloud-based options (like ElevenLabs) for this project. While they offer superior emotional range, our "24/7 Always-On" requirement created a scaling bottleneck:
147
+
148
+ ### **The "Linear Cost" Problem:**
149
+ Let's assume we have just **10 sessions per day**, each 10 minutes long, totalling:
150
+
151
+ ```10 * 10 = 100 minutes per day```
152
+
153
+ ```100 * 30 = 3000 minutes per month ```
154
+
155
+ Cloud options would bill hundreds of dollars per month for this:
156
+ (e.g. the Scale plan of ElevenLabs for their high-fidelity models offers **2000** minutes for **$330/mo**, plus **$0.18/minute** for further usage)
157
+
158
+ That's already $510 per month: $330 + (3000 - 2000) * $0.18 = $330 + $180.
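
The arithmetic above can be sketched as a small function, which also shows how the bill grows with usage. This is a rough model using only the plan figures quoted in the text ($330 base, 2000 included minutes, $0.18 per extra minute); the function name and parameters are illustrative, and real pricing may differ.

```python
def cloud_tts_monthly_cost(
    sessions_per_day,
    minutes_per_session,
    days=30,
    base_fee=330.0,           # plan price quoted in the text
    included_minutes=2000,    # minutes included in the plan
    overage_per_minute=0.18,  # price per minute beyond the included quota
):
    """Return (total_minutes, monthly_cost_usd) for a flat plan with overage."""
    total_minutes = sessions_per_day * minutes_per_session * days
    extra_minutes = max(0, total_minutes - included_minutes)
    return total_minutes, base_fee + extra_minutes * overage_per_minute


# The scenario from the text: 10 sessions per day, 10 minutes each.
minutes_10, cost_10 = cloud_tts_monthly_cost(10, 10)
# Ten times the traffic: the bill scales roughly linearly with usage.
minutes_100, cost_100 = cloud_tts_monthly_cost(100, 10)

print(f"{minutes_10} min/month -> ${cost_10:.2f}")
print(f"{minutes_100} min/month -> ${cost_100:.2f}")
```

Once usage exceeds the included quota, every additional minute adds to the bill, which is the "linear cost" this section refers to.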
159
+
160
+ ### **The Solution:**
161
+ We run **PiperTTS locally** via ONNX Runtime within Unreal Engine. It runs on the CPU, so it doesn't take GPU resources away from rendering.
162
+ 1. **Cost is Flat:** We pay $0 for TTS, whether we run 10 shows per day or 100.
163
+ 2. **Latency is Low:** No network round trip.
164
+
165
+ ### **Does that mean the show will never have high-fidelity emotional TTS?**
166
+ Of course not. Deploying a custom fine-tuned or base open-source TTS model with emotional capabilities is a viable choice for 24x7 usage like this show, and renting a capable GPU like the RTX 4000 Ada costs just ~**$190/month** on Runpod, giving us up to **720** hours of audio per month (24 hours x 30 days, assuming real-time generation).
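
As a rough sanity check on the rented-GPU option, the sketch below turns the flat monthly rent into an effective cost per audio minute. It assumes the ~$190/month figure from the text and that the TTS model generates audio at least as fast as real time, which is an assumption and not something measured here. The names are illustrative.

```python
GPU_RENT_USD_PER_MONTH = 190.0   # approximate rental price quoted in the text
HOURS_PER_MONTH = 24 * 30        # 720 hours in a 30-day month


def gpu_cost_per_audio_minute(monthly_rent, audio_minutes):
    """Effective cost per minute of generated audio for a flat monthly rent."""
    if audio_minutes <= 0:
        raise ValueError("audio_minutes must be positive")
    return monthly_rent / audio_minutes


# Same workload as the cloud example: 3000 minutes of audio per month.
per_minute_light = gpu_cost_per_audio_minute(GPU_RENT_USD_PER_MONTH, 3000)
# Fully occupied show: audio generated around the clock (assumes real-time TTS).
per_minute_full = gpu_cost_per_audio_minute(
    GPU_RENT_USD_PER_MONTH, HOURS_PER_MONTH * 60
)

print(f"3000 min/month: ${per_minute_light:.4f} per minute")
print(f"24x7 usage:     ${per_minute_full:.4f} per minute")
```

Because the rent is flat, the effective per-minute price falls as usage rises, which is the opposite of the metered cloud plan.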
167
+
168
  ## 💡 Why This Matters
169
 
170
+ This project demonstrates that **MCP is not just for file editing or database queries**: it can be the bridge between **Virtual Worlds** and **Large Language Models**. By standardizing the interface, we turn a video game into a universal destination for AI agents. We think the future is full of simulations in which LLMs or VLAs (vision-language-action models) are the agents doing cool stuff, while we observe, deploy and tinker.
171
 
172
  ---
173
 
image.png → image-1.png RENAMED
File without changes
image-2.png ADDED

Git LFS Details

  • SHA256: 49c6c6c114268fa17a20b6fb0f6cd26c4975adf3e544b4eea804cf6c92a3bdf3
  • Pointer size: 130 Bytes
  • Size of remote file: 30.4 kB
image-3.png ADDED

Git LFS Details

  • SHA256: 91913b8881fbdc44ffdeab5f46fcf2d536615e0217bab367a635191bc0241df4
  • Pointer size: 130 Bytes
  • Size of remote file: 32.9 kB
image-4.png ADDED

Git LFS Details

  • SHA256: 0f46c73f3e3ac8bc53946b05058927e3aabeedfa257d5068f367d2fefed540e0
  • Pointer size: 131 Bytes
  • Size of remote file: 123 kB