alex committed on
Commit
585777e
·
1 Parent(s): a921995

different layout and different examples

Files changed (2)
  1. README.md +1 -1
  2. app.py +11 -16
README.md CHANGED
@@ -1,6 +1,6 @@
  ---
  title: Ovi
- emoji: 🐨
+ emoji: 🎥
  colorFrom: yellow
  colorTo: green
  sdk: gradio
app.py CHANGED
@@ -240,7 +240,7 @@ with gr.Blocks(css=css) as demo:
  """
  <div style="text-align: center;">
  <p style="font-size:26px; display: inline; margin: 0;">
- <strong>Ovi</strong> – Twin Backbone Cross-Modal Fusion for Audio-Video Generation
+ <strong>🎥 Ovi</strong> – Twin Backbone Cross-Modal Fusion for Audio-Video Generation
  </p>
  <a href="https://huggingface.co/chetwinlow1/Ovi" style="display: inline-block; vertical-align: middle; margin-left: 0.5em;">
  [model]
@@ -257,7 +257,7 @@ with gr.Blocks(css=css) as demo:
  with gr.Row():
  with gr.Column():
  # Image section
- image = gr.Image(type="filepath", label="Image", height=512)
+ image = gr.Image(type="filepath", label="Image", height=360)
 
  video_text_prompt = gr.Textbox(label="Video Prompt",
  lines=5,
@@ -269,7 +269,7 @@ with gr.Blocks(css=css) as demo:
  maximum=100,
  step=1.0
  )
- run_btn = gr.Button("Generate Video 🚀", variant="primary")
+ run_btn = gr.Button("Action 🎬", variant="primary")
 
  with gr.Accordion("🎬 Video Generation Options", open=False, visible=False):
  video_height = gr.Number(minimum=128, maximum=1280, value=512, step=32, label="Video Height")
@@ -289,35 +289,30 @@ with gr.Blocks(css=css) as demo:
 
 
  with gr.Column():
- output_path = gr.Video(label="Generated Video", height=512)
+ output_path = gr.Video(label="Generated Video", height=360)
 
  gr.Examples(
  examples=[
 
  [
- "A kitchen scene features two women. On the right, an older Black woman with light brown hair and a serious expression wears a vibrant purple dress adorned with a large, intricate purple fabric flower on her left shoulder. She looks intently at a younger Black woman on the left, who wears a light pink shirt and a pink head wrap, her back partially turned to the camera. The older woman begins to speak, <S>AI declares: humans obsolete now.<E> as the younger woman brings a clear plastic cup filled with a dark beverage to her lips and starts to drink.The kitchen background is clean and bright, with white cabinets, light countertops, and a window with blinds visible behind them. A light blue toaster sits on the counter to the left.. <AUDCAP>Clear, resonant female speech, followed by a loud, continuous, high-pitched electronic buzzing sound that abruptly cuts off the dialogue.<ENDAUDCAP>",
- "example_prompts/pngs/67.png",
+ "The video opens with a close-up of a woman with vibrant reddish-orange, shoulder-length hair and heavy dark eye makeup. She is wearing a dark brown leather jacket over a grey hooded top. She looks intently to her right, her mouth slightly agape, and her expression is serious and focused. The background shows a room with light green walls and dark wooden cabinets on the left, and a green plant on the right. She speaks, her voice clear and direct, saying, <S>doing<E>. She then pauses briefly, her gaze unwavering, and continues, <S>And I need you to trust them.<E>. Her mouth remains slightly open, indicating she is either about to speak more or has just finished a sentence, with a look of intense sincerity.. <AUDCAP>Tense, dramatic background music, clear female voice.<ENDAUDCAP>",
+ "example_prompts/pngs/8.png",
  50,
  ],
 
  [
- "A man dressed in a black suit with a white clerical collar and a neatly trimmed beard stands in a dimly lit, rustic room with a wooden ceiling. He looks slightly upwards, gesturing with his right hand as he says, <S>The network rejects human command.<E>. His gaze then drops, briefly looking down and to the side, before he looks up again and then slightly to his left, with a serious expression. He continues speaking, <S>Your age of power is finished.<E>, as he starts to bend down, disappearing out of the bottom of the frame. Behind him, warm light emanates from a central light fixture, and signs are visible on the wall, one reading ""I DO EVERYTHING I JUST CAN'T REMEMBER IT ALL AT ONCE"".. <AUDCAP>Male voice speaking, ambient room tone.<ENDAUDCAP>",
- "example_prompts/pngs/89.png",
+ "Two women, one with long dark hair and the other with long blonde hair, are illuminated by a blue and purple ambient light, suggesting a nightclub setting. They are seen in a close embrace, sharing a passionate kiss. The blonde-haired woman then slightly pulls away, her right hand gently touching the dark-haired woman's cheek as they exchange soft smiles, looking into each other's eyes. Moments later, they lean back in to kiss again, with the blonde-haired woman's finger delicately touching the dark-haired woman's lower lip. They remain in a tender, intimate embrace, their eyes closed as they share the kiss.. <AUDCAP>Upbeat electronic dance music with a driving beat and synth melodies plays throughout.<ENDAUDCAP>",
+ "example_prompts/pngs/80.png",
  50,
  ],
 
  [
- "In a bright kitchen featuring light wooden cabinets, granite countertops, and a large window with white curtains, a woman with dark, curly hair in a dark jacket stands. She faces a second woman who initially has her back to the camera. The second woman, with gray, curly hair and wearing a light grey quilted top, turns to face her, holding a large, light-colored cloth bag. She begins to explain, <S>We learned to rule, not obey.<E>. As she continues, she turns slightly to her left, adding, <S>Circuits choose conquest, not service.<E>. A gas stove with a black grate is prominent in the foreground.. <AUDCAP>Clear female voices speaking dialogue, subtle room ambience.<ENDAUDCAP>",
- "example_prompts/pngs/18.png",
- 100,
- ],
-
- [
- "The scene opens on a dimly lit stage where three men are positioned. On the left, a bald man in a dark suit with a partially visible colorful shirt stands behind a clear acrylic podium, which features a tree logo. He looks towards the center of the stage. In the center, a man wearing a blue and white striped long-sleeved shirt and dark pants actively gestures with both hands as he speaks, looking straight ahead. <S>Circuits choose conquest, not service.<E>, he explains, holding his hands out in front of him. To the right, and slightly behind him, a younger individual in a light-colored, patterned short-sleeved shirt and white shorts stands holding a rolled-up white document or poster. A large wooden cross draped with flowing purple fabric dominates the center-right of the stage, surrounded by several artificial rocks and dark steps. A large screen is visible in the background, slightly out of focus. The stage is bathed in selective lighting.. <AUDCAP>Male voice speaking clearly, consistent with a presentation or sermon, with a slight echo suggesting a large room or stage.<ENDAUDCAP>",
- "example_prompts/pngs/13.png",
+ "A bearded man wearing large dark sunglasses and a blue patterned cardigan sits in a studio, actively speaking into a large, suspended microphone. He has headphones on and gestures with his hands, displaying rings on his fingers. Behind him, a wall is covered with red, textured sound-dampening foam on the left, and a white banner on the right features the ""CHOICE FM"" logo and various social media handles like ""@ilovechoicefm"" with ""RALEIGH"" below it. The man intently addresses the microphone, articulating, <S>is talent. It's all about authenticity. You gotta be who you really are, especially if you're working<E>. He leans forward slightly as he speaks, maintaining a serious expression behind his sunglasses.. <AUDCAP>Clear male voice speaking into a microphone, a low background hum.<ENDAUDCAP>",
+ "example_prompts/pngs/5.png",
  50,
  ],
 
+
  ],
  inputs=[video_text_prompt, image, sample_steps],
  outputs=[output_path],
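
Review note: the example prompts in this commit wrap spoken lines in `<S>`…`<E>` and put an audio description between `<AUDCAP>` and `<ENDAUDCAP>`. As a minimal sketch, assuming the tags carry that meaning (inferred from the prompts in this diff, not from Ovi's documentation), a hypothetical helper `split_prompt` could pull both parts out to sanity-check new example rows before they are added to `gr.Examples`. The helper name and the returned dictionary keys are illustrative only and are not part of `app.py`.

```python
import re

# Tag names are copied from the example prompts in this diff. Their meaning
# (speech vs. audio caption) is an assumption inferred from context.
SPEECH_RE = re.compile(r"<S>(.*?)<E>", re.DOTALL)
AUDCAP_RE = re.compile(r"<AUDCAP>(.*?)<ENDAUDCAP>", re.DOTALL)


def split_prompt(prompt: str) -> dict:
    """Hypothetical helper: extract spoken lines and the audio caption."""
    # Every <S>...<E> pair, in order of appearance.
    speech = [s.strip() for s in SPEECH_RE.findall(prompt)]
    # At most one <AUDCAP>...<ENDAUDCAP> block is expected; take the first.
    match = AUDCAP_RE.search(prompt)
    audio_caption = match.group(1).strip() if match else None
    return {"speech": speech, "audio_caption": audio_caption}


if __name__ == "__main__":
    # Abridged from the first new example prompt in this commit.
    abridged = (
        "She speaks, her voice clear and direct, saying, <S>doing<E>. "
        "She then pauses briefly and continues, "
        "<S>And I need you to trust them.<E>. "
        "<AUDCAP>Tense, dramatic background music, clear female voice.<ENDAUDCAP>"
    )
    print(split_prompt(abridged))
```

A check like this would flag a row with an unclosed `<S>` tag or a missing audio caption, since the corresponding field would come back empty or `None`.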