# AnimateDiff prompt travel

[AnimateDiff](https://github.com/guoyww/AnimateDiff) with prompt travel + [ControlNet](https://github.com/lllyasviel/ControlNet) + [IP-Adapter](https://github.com/tencent-ailab/IP-Adapter)

I added an experimental feature to animatediff-cli that lets you change the prompt partway through the animation.

It seems to work surprisingly well!

### Example

- context_schedule "composite"
  - pros: more stable animation
  - cons: ignores prompts that require compositional changes
  - options: "uniform" (default) / "composite"

<div><video controls src="https://github.com/s9roll7/animatediff-cli-prompt-travel/assets/118420657/717536f1-7c60-49a6-ad3a-de31d6b3efd9" muted="false"></video></div>
<br>





- controlnet for region
  - controlnet_openpose for the foreground
  - controlnet_tile (0.7) for the background
<div><video controls src="https://github.com/s9roll7/animatediff-cli-prompt-travel/assets/118420657/09fbf348-42ed-425c-9ec0-2865d233203a" muted="false"></video></div>
<br>


- added new controlnet: [animatediff-controlnet](https://www.reddit.com/r/StableDiffusion/comments/183gt1g/animation_with_animatediff_and_retrained/)
  - It works like ip2p and is very useful for replacing characters
  - (This sample was generated at high resolution using the gradual latent hires fix)
  - more examples [here](https://github.com/s9roll7/animatediff-cli-prompt-travel/issues/189)
<div><video controls src="https://github.com/s9roll7/animatediff-cli-prompt-travel/assets/118420657/a867c480-3f1a-4e2c-874f-88c1a54e8903" muted="false"></video></div>
<br>


- gradual latent hires fix
  - sd15 512x856 / sd15 768x1280 / sd15 768x1280 with gradual latent hires fix
  - more examples [here](https://github.com/s9roll7/animatediff-cli-prompt-travel/issues/188)
<div><video controls src="https://github.com/s9roll7/animatediff-cli-prompt-travel/assets/118420657/346c0541-7f05-4c45-ab2a-911bc0942fa8" muted="false"></video></div>
<br>


- [sdxl turbo lora](https://civitai.com/models/215485?modelVersionId=242807)
  - more examples [here](https://github.com/s9roll7/animatediff-cli-prompt-travel/issues/184)

<div><video controls src="https://github.com/s9roll7/animatediff-cli-prompt-travel/assets/118420657/15c39ac2-3853-44f5-b08c-142c90985c4b" muted="false"></video></div>
<br>

<br>

[Click here to see old samples.](example.md)

<br>
<br>


### Installation (for Windows)
Same as the original animatediff.  
[Python 3.10](https://www.python.org/) and a git client must be installed.  

(https://www.reddit.com/r/StableDiffusion/comments/157c0wl/working_animatediff_cli_windows_install/)  

I found a detailed tutorial:  
(https://www.reddit.com/r/StableDiffusion/comments/16vlk9j/guide_to_creating_videos_with/)  
(https://www.youtube.com/watch?v=7_hh3wOD81s)  

### How To Use
Almost the same as the original animatediff-cli, but with a slight change in the config format.
```json

{
  "name": "sample",
  "path": "share/Stable-diffusion/mistoonAnime_v20.safetensors",  # Specify Checkpoint as a path relative to /animatediff-cli/data
  "lcm_map":{     # lcm-lora
    "enable":false,
    "start_scale":0.15,
    "end_scale":0.75,
    "gradient_start":0.2,
    "gradient_end":0.75
  },
  "gradual_latent_hires_fix_map":{ # gradual latent hires fix
    # This option addresses the chaotic results that appear when generating beyond the model's proper size.
    # It also has the effect of increasing generation speed.
    "enable": false,    # enable/disable
    "scale": {    # "DENOISE PROGRESS" : LATENT SCALE format
      # In this example, for the first 70% of the denoising process the latent is kept at half the specified size.
      # From 70% to the end, it is computed at the specified size.
      "0": 0.5,
      "0.7": 1.0
    },
    "reverse_steps": 5,          # Number of reversal steps at latent size switching timing
    "noise_add_count":3          # Additive amount of noise at latent size switching timing
  },
  "vae_path":"share/VAE/vae-ft-mse-840000-ema-pruned.ckpt",       # Specify vae as a path relative to /animatediff-cli/data
  "motion_module": "models/motion-module/mm_sd_v14.ckpt",         # Specify motion module as a path relative to /animatediff-cli/data
  "context_schedule":"uniform",          # "uniform" or "composite"
  "compile": false,
  "seed": [
    341774366206100,-1,-1         # -1 means random. If "--repeats 3" is specified with this setting, the first will be 341774366206100 and the second and third will be random.
  ],
  "scheduler": "ddim",      # "ddim","euler","euler_a","k_dpmpp_2m", etc...
  "steps": 40,
  "guidance_scale": 20,     # cfg scale
  "clip_skip": 2,
  "prompt_fixed_ratio": 0.5,
  "head_prompt": "masterpiece, best quality, a beautiful and detailed portriat of muffet, monster girl,((purple body:1.3)),humanoid, arachnid, anthro,((fangs)),pigtails,hair bows,5 eyes,spider girl,6 arms,solo",
  "prompt_map": {           # "FRAME" : "PROMPT" format / ex. prompt for frame 32 is "head_prompt" + prompt_map["32"] + "tail_prompt"
    "0":  "smile standing,((spider webs:1.0))",
    "32":  "(((walking))),((spider webs:1.0))",
    "64":  "(((running))),((spider webs:2.0)),wide angle lens, fish eye effect",
    "96":  "(((sitting))),((spider webs:1.0))"
  },
  "tail_prompt": "clothed, open mouth, awesome and detailed background, holding teapot, holding teacup, 6 hands,detailed hands,storefront that sells pastries and tea,bloomers,(red and black clothing),inside,pouring into teacup,muffetwear",
  "n_prompt": [
    "(worst quality, low quality:1.4),nudity,simple background,border,mouth closed,text, patreon,bed,bedroom,white background,((monochrome)),sketch,(pink body:1.4),7 arms,8 arms,4 arms"
  ],
  "lora_map": {             # "PATH_TO_LORA" : STRENGTH format
    "share/Lora/muffet_v2.safetensors" : 1.0,                     # Specify lora as a path relative to /animatediff-cli/data
    "share/Lora/add_detail.safetensors" : 1.0                     # Lora support is limited. Not all formats can be used!!!
  },
  "motion_lora_map": {      # "PATH_TO_LORA" : STRENGTH format
    "models/motion_lora/v2_lora_RollingAnticlockwise.ckpt":0.5,   # Currently, the officially distributed lora seems to work only for v2 motion modules (mm_sd_v15_v2.ckpt).
    "models/motion_lora/v2_lora_ZoomIn.ckpt":0.5
  },
  "ip_adapter_map": {       # config for ip-adapter
      # enable/disable (important)
      "enable": true,
      # Specify the input image directory relative to /animatediff-cli/data (important! No need to specify frames in the config file; the effect on generation follows exactly the same logic as prompt placement)
      "input_image_dir": "ip_adapter_image/test",
      "prompt_fixed_ratio": 0.5,
      # save input image or not
      "save_input_image": true,
      # Ratio of image prompt vs text prompt (important). Even if you want to emphasize only the image prompt by setting this to 1.0, do not leave the prompt/negative prompt empty; specify generic text such as "best quality".
      "scale": 0.5,
      # IP-Adapter/IP-Adapter Full Face/IP-Adapter Plus Face/IP-Adapter Plus/IP-Adapter Light (important). Each gives a completely different outcome; PLUS does not always produce a superior result.
      "is_full_face": false,
      "is_plus_face": false,
      "is_plus": true,
      "is_light": false
  },
  "img2img_map": {
      # enable/disable
      "enable": true,
      # Directory where the initial image is placed
      "init_img_dir": "..\\stylize\\2023-10-27T19-43-01-sample-mistoonanime_v20\\00_img2img",
      "save_init_image": true,
      # The smaller the value, the closer the result will be to the initial image.
      "denoising_strength": 0.7
  },
  "region_map": {
      # setting for region 0. You can also add regions if necessary.
      # Regions added later are drawn in front of earlier ones.
      "0": {
          # enable/disable
          "enable": true,
          # If you want to draw a separate object for each region, enter a value of 0.1 or higher.
          "crop_generation_rate": 0.1,
          # Directory where mask images are placed
          "mask_dir": "..\\stylize\\2023-10-27T19-43-01-sample-mistoonanime_v20\\r_fg_00_2023-10-27T19-44-08\\00_mask",
          "save_mask": true,
          # If true, the initial image will be drawn as is (inpaint)
          "is_init_img": false,
          # conditions for region 0
          "condition": {
              # text prompt for region 0
              "prompt_fixed_ratio": 0.5,
              "head_prompt": "",
              "prompt_map": {
                  "0": "(masterpiece, best quality:1.2), solo, 1girl, kusanagi motoko, looking at viewer, jacket, leotard, thighhighs, gloves, cleavage"
               },
              "tail_prompt": "",
              # image prompt(ip adapter) for region 0
              # It is not possible to change lora for each region, but you can do something similar using an ip adapter.
              "ip_adapter_map": {
                  "enable": true,
                  "input_image_dir": "..\\stylize\\2023-10-27T19-43-01-sample-mistoonanime_v20\\r_fg_00_2023-10-27T19-44-08\\00_ipadapter",
                  "prompt_fixed_ratio": 0.5,
                  "save_input_image": true,
                  "resized_to_square": false
              }
          }
      },
      # setting for background
      "background": {
          # If true, the initial image will be drawn as is (inpaint)
          "is_init_img": true,
          "hint": "background's condition refers to the one in root"
      }
  },
  "controlnet_map": {       # config for controlnet(for generation)
    "input_image_dir" : "controlnet_image/test",    # Specify input image directory relative to /animatediff-cli/data (important! Please refer to the directory structure of sample. No need to specify frames in the config file.)
    "max_samples_on_vram" : 200,    # If you specify a large number of images for controlnet and vram will not be enough, reduce this value. 0 means that everything should be placed in cpu.
    "max_models_on_vram" : 3,       # Number of controlnet models to be placed in vram
    "save_detectmap" : true,        # save preprocessed image or not
    "preprocess_on_gpu": true,      # run preprocess on gpu or not (It probably does not affect vram usage at peak, so it should always set true.)
    "is_loop": true,                # Whether controlnet effects consider loop

    "controlnet_tile":{    # config for controlnet_tile
      "enable": true,              # enable/disable (important)
      "use_preprocessor":true,      # Whether to use a preprocessor for each controlnet type
      "preprocessor":{     # If not specified, the default preprocessor is selected.(Most of the time the default should be fine.)
        # none/blur/tile_resample/upernet_seg/ or key in controlnet_aux.processor.MODELS
        # https://github.com/patrickvonplaten/controlnet_aux/blob/2fd027162e7aef8c18d0a9b5a344727d37f4f13d/src/controlnet_aux/processor.py#L20
        "type" : "tile_resample",
        "param":{
          "down_sampling_rate":2.0
        }
      },
      "guess_mode":false,
      # control weight (important)
      "controlnet_conditioning_scale": 1.0,
      # starting control step
      "control_guidance_start": 0.0,
      # ending control step
      "control_guidance_end": 1.0,
      # list of influences on neighboring frames (important)
      # This means that there is an impact of 0.5 on both neighboring frames and 0.4 on the one next to it. Try lengthening, shortening, or changing the values inside.
      "control_scale_list":[0.5,0.4,0.3,0.2,0.1],
      # list of regions where controlnet works
      # In this example, it only affects region "0", but not "background".
      "control_region_list": ["0"]
    },
    "controlnet_ip2p":{
      "enable": true,
      "use_preprocessor":true,
      "guess_mode":false,
      "controlnet_conditioning_scale": 1.0,
      "control_guidance_start": 0.0,
      "control_guidance_end": 1.0,
      "control_scale_list":[0.5,0.4,0.3,0.2,0.1],
      # In this example, all regions are affected
      "control_region_list": []
    },
    "controlnet_lineart_anime":{
      "enable": true,
      "use_preprocessor":true,
      "guess_mode":false,
      "controlnet_conditioning_scale": 1.0,
      "control_guidance_start": 0.0,
      "control_guidance_end": 1.0,
      "control_scale_list":[0.5,0.4,0.3,0.2,0.1],
      # In this example, it only affects region "background", but not "0".
      "control_region_list": ["background"]
    },
    "controlnet_openpose":{
      "enable": true,
      "use_preprocessor":true,
      "guess_mode":false,
      "controlnet_conditioning_scale": 1.0,
      "control_guidance_start": 0.0,
      "control_guidance_end": 1.0,
      "control_scale_list":[0.5,0.4,0.3,0.2,0.1],
      # In this example, all regions are affected (since these are the only two regions defined)
      "control_region_list": ["0", "background"]
    },
    "controlnet_softedge":{
      "enable": true,
      "use_preprocessor":true,
      "preprocessor":{
        "type" : "softedge_pidsafe",
        "param":{
        }
      },
      "guess_mode":false,
      "controlnet_conditioning_scale": 1.0,
      "control_guidance_start": 0.0,
      "control_guidance_end": 1.0,
      "control_scale_list":[0.5,0.4,0.3,0.2,0.1]
    },
    "controlnet_ref": {
        "enable": false,            # enable/disable (important)
        "ref_image": "ref_image/ref_sample.png",     # path to reference image.
        "attention_auto_machine_weight": 1.0,
        "gn_auto_machine_weight": 1.0,
        "style_fidelity": 0.5,                # control weight-like parameter(important)
        "reference_attn": true,               # [attn=true , adain=false] means "reference_only"
        "reference_adain": false,
        "scale_pattern":[0.5]                 # Pattern for applying controlnet_ref to frames
    }                                         # ex. [0.5] means [0.5,0.5,0.5,0.5,0.5 .... ]. All frames are affected by 50%
                                              # ex. [1, 0] means [1,0,1,0,1,0,1,0,1,0,1 ....]. Only even frames are affected by 100%.
  },
  "upscale_config": {       # config for tile-upscale
    "scheduler": "ddim",
    "steps": 20,
    "strength": 0.5,
    "guidance_scale": 10,
    "controlnet_tile": {    # config for controlnet tile
      "enable": true,       # enable/disable (important)
      "controlnet_conditioning_scale": 1.0,     # control weight (important)
      "guess_mode": false,
      "control_guidance_start": 0.0,      # starting control step
      "control_guidance_end": 1.0         # ending control step
    },
    "controlnet_line_anime": {  # config for controlnet line anime
      "enable": false,
      "controlnet_conditioning_scale": 1.0,
      "guess_mode": false,
      "control_guidance_start": 0.0,
      "control_guidance_end": 1.0
    },
    "controlnet_ip2p": {  # config for controlnet ip2p
      "enable": false,
      "controlnet_conditioning_scale": 0.5,
      "guess_mode": false,
      "control_guidance_start": 0.0,
      "control_guidance_end": 1.0
    },
    "controlnet_ref": {   # config for controlnet ref
      "enable": false,             # enable/disable (important)
      "use_frame_as_ref_image": false,   # use original frames as ref_image for each upscale (important)
      "use_1st_frame_as_ref_image": false,   # use 1st original frame as ref_image for all upscale (important)
      "ref_image": "ref_image/path_to_your_ref_img.jpg",   # use specified image file as ref_image for all upscale (important)
      "attention_auto_machine_weight": 1.0,
      "gn_auto_machine_weight": 1.0,
      "style_fidelity": 0.25,       # control weight-like parameter(important)
      "reference_attn": true,       # [attn=true , adain=false] means "reference_only"
      "reference_adain": false
    }
  },
  "output":{   # output format 
    "format" : "gif",   # gif/mp4/webm
    "fps" : 8,
    "encode_param":{
      "crf": 10
    }
  }
}
```
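
To make the `prompt_map` behaviour concrete, here is a minimal, illustrative sketch of how the effective text prompt for a keyframe is assembled (`head_prompt` + the `prompt_map` entry + `tail_prompt`, as the config comment describes). This is not the repository's actual implementation: the real pipeline also interpolates conditioning between keyframes (the "prompt travel" part) and takes `prompt_fixed_ratio` into account.

```py
# Illustrative sketch only: assemble the full prompt for each keyframe listed in prompt_map.
def build_keyframe_prompts(head_prompt: str, prompt_map: dict[str, str], tail_prompt: str) -> dict[int, str]:
    prompts = {}
    for frame, body in prompt_map.items():
        parts = [p for p in (head_prompt, body, tail_prompt) if p]
        prompts[int(frame)] = ", ".join(parts)
    return prompts

keyframes = build_keyframe_prompts(
    "masterpiece, best quality",
    {"0": "smile standing", "32": "(((walking)))"},
    "awesome and detailed background",
)
print(keyframes[32])
# masterpiece, best quality, (((walking))), awesome and detailed background
```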

```sh
python3 -m pip install -U realesrgan imageio-ffmpeg
python3 - <<EOF
import fileinput, importlib
for line in fileinput.input(importlib.util.find_spec('basicsr').origin.replace('__init__', 'data/degradations'), inplace=True): print(line.replace('torchvision.transforms.functional_tensor', 'torchvision.transforms.functional'), end='')
for line in fileinput.input(importlib.util.find_spec('facexlib.detection').origin, inplace=True): print(line.replace('https://github.com/xinntao/facexlib/releases/download/v0.1.0/detection_Resnet50_Final.pth', 'https://huggingface.co/chaowenguo/pal/resolve/main/detection_Resnet50_Final.pth'), end='')
for line in fileinput.input(importlib.util.find_spec('facexlib.parsing').origin, inplace=True): print(line.replace('https://github.com/xinntao/facexlib/releases/download/v0.2.2/parsing_parsenet.pth', 'https://huggingface.co/chaowenguo/pal/resolve/main/parsing_parsenet.pth'), end='')
EOF

git clone https://bitbucket.org/chaowenguo/animatediff
cd animatediff
python3 -m pip install -U . torchvision==0.21 onnxruntime-gpu moviepy
cd ..
cp -r animatediff/config .

curl -L -O https://huggingface.co/chaowenguo/AnimateLCM/resolve/main/AnimateLCM_sd15_t2v.ckpt
cat <<EOF > config/prompts/prompt_travel.json
{
  "name": "sample",
  "path": "",
  "motion_module": "", 
  "lcm_map":{
    "enable":true,
    "start_scale":0.15,
    "end_scale":0.75,
    "gradient_start":0.2,
    "gradient_end":0.75
  },
  "seed": [
     1
  ],
  "scheduler": "lcm",
  "steps": 8,
  "guidance_scale": 3,
  "unet_batch_size": 1,
  "clip_skip": 2,
  "prompt_fixed_ratio": 1,
  "head_prompt": "A full body gorgeous smiling slim young cleavage robust boob japanese girl, beautiful face, wearing skirt, standing on beach, two hands each with five fingers, two arms, front view",
  "prompt_map": {
      "0": "waving hand, open palm"
  },
  "tail_prompt": "best quality, extremely detailed, HD, ultra-realistic, 8K, HQ, masterpiece, trending on artstation, art, smooth",
  "n_prompt": [
    "(nipple:1.4), dudou, shirt, skirt, collar, shawl, hat, sock, sleeve, glove, headgear, back view, monochrome, longbody, lowres, bad anatomy, bad hands, fused fingers, missing fingers, too many fingers, extra digit, fewer digits, cropped, worst quality, low quality, deformed body, bloated, ugly, unrealistic, extra hands and arms"
  ],
  "lora_map": {},
  "motion_lora_map": {}
}
EOF
```

```py
import basicsr, realesrgan, gfpgan, imageio, pathlib, diffusers, torch, transformers, moviepy, builtins, numpy, re
from animatediff import get_dir
from animatediff.generate import (controlnet_preprocess, create_pipeline,
                                  create_us_pipeline, img2img_preprocess,
                                  ip_adapter_preprocess,
                                  load_controlnet_models, prompt_preprocess,
                                  region_preprocess, run_inference,
                                  run_upscale, save_output,
                                  unload_controlnet_models,
                                  wild_card_conversion)
from animatediff.settings import (CKPT_EXTENSIONS, InferenceConfig,
                                  ModelConfig, get_infer_config,
                                  get_model_config)
from animatediff.utils.model import (checkpoint_to_pipeline,
                                     fix_checkpoint_if_needed, get_base_model)
from animatediff.utils.pipeline import get_context_params, send_to_device
from animatediff.utils.util import (is_sdxl_checkpoint,
                                    is_v2_motion_module,
                                    set_tensor_interpolation_method)
from animatediff.pipelines import load_text_embeddings
from animatediff.schedulers import DiffusionScheduler, get_scheduler
from animatediff.pipelines.lora import load_lcm_lora, load_lora_map
import huggingface_hub
import animatediff

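# Output resolution and total number of frames to generate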
width=432
height=768
length=1440
model_config = get_model_config('config/prompts/prompt_travel.json')
is_sdxl = False
is_v2 = True
infer_config = get_infer_config(is_v2, is_sdxl)
set_tensor_interpolation_method(model_config.tensor_interpolation_slerp)
device = torch.device('cuda')
save_dir = pathlib.Path('output')
controlnet_image_map, controlnet_type_map, controlnet_ref_map, controlnet_no_shrink = controlnet_preprocess(model_config.controlnet_map, width, height, length, save_dir, device, is_sdxl)
img2img_map = img2img_preprocess(model_config.img2img_map, width, height, length, save_dir)

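# Save a vanilla SD1.5 pipeline locally so its components (tokenizer, text encoder, VAE, UNet config) can be reloaded individually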
base_model = pathlib.Path('/tmp/base')
diffusers.StableDiffusionPipeline.from_pretrained('chaowenguo/stable-diffusion-v1-5').save_pretrained(base_model)

tokenizer = transformers.CLIPTokenizer.from_pretrained(base_model, subfolder='tokenizer')
text_encoder = transformers.CLIPTextModel.from_pretrained(base_model, subfolder='text_encoder')
vae = diffusers.AutoencoderKL.from_pretrained(base_model, subfolder='vae')
unet = animatediff.models.unet.UNet3DConditionModel.from_pretrained_2d(
    pretrained_model_path=base_model,
    motion_module_path=pathlib.Path.cwd().joinpath('AnimateLCM_sd15_t2v.ckpt'),
    subfolder='unet',
    unet_additional_kwargs=infer_config.unet_additional_kwargs,
)
feature_extractor = transformers.CLIPImageProcessor.from_pretrained(base_model, subfolder='feature_extractor')

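# Load the chilloutMix-Ni checkpoint and copy its weights into the AnimateDiff components, then drop the temporary pipeline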
pipeline = diffusers.StableDiffusionPipeline.from_single_file('https://huggingface.co/chaowenguo/pal/blob/main/chilloutMix-Ni.safetensors',config='chaowenguo/stable-diffusion-v1-5', safety_checker=None, use_safetensors=True)
unet.load_state_dict(pipeline.unet.state_dict(), strict=False)
text_encoder.load_state_dict(pipeline.text_encoder.state_dict(), strict=False)
vae.load_state_dict(pipeline.vae.state_dict(), strict=False)
del pipeline
unet.enable_xformers_memory_efficient_attention()

pipeline = animatediff.pipelines.AnimationPipeline(
    vae=vae,
    text_encoder=text_encoder,
    tokenizer=tokenizer,
    unet=unet,
    scheduler=get_scheduler(model_config.scheduler, infer_config.noise_scheduler_kwargs),
    feature_extractor=feature_extractor,
    controlnet_map=None,
)

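# Download the AnimateLCM LoRA into data/models/lcm_lora/sd15, then apply the LCM and regular LoRA maps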
lcm_lora = pathlib.Path.cwd().joinpath('data/models/lcm_lora/sd15')
lcm_lora.mkdir(parents=True, exist_ok=True)
huggingface_hub.hf_hub_download(repo_id='chaowenguo/AnimateLCM', filename='AnimateLCM_sd15_t2v_lora.safetensors', local_dir=lcm_lora)
load_lcm_lora(pipeline, model_config.lcm_map, is_sdxl=is_sdxl)
load_lora_map(pipeline, model_config.lora_map, length, is_sdxl=is_sdxl)

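# Convert to fp16, load text embeddings (the text encoder is briefly moved to the GPU for this), then send the frozen pipeline to the device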
pipeline.unet = pipeline.unet.half()
pipeline.text_encoder = pipeline.text_encoder.half()
pipeline.text_encoder = pipeline.text_encoder.to(device)
load_text_embeddings(pipeline)
pipeline.text_encoder = pipeline.text_encoder.to('cpu')

pipeline = send_to_device(pipeline, device, freeze=True, force_half=False, compile=False, is_sdxl=is_sdxl)

wild_card_conversion(model_config)

is_init_img_exist = img2img_map is not None
region_condi_list, region_list, ip_adapter_config_map, region2index = region_preprocess(model_config, width, height, length, save_dir, is_init_img_exist, is_sdxl)

if controlnet_type_map:
    for c in controlnet_type_map:
        tmp_r = [region2index[r] for r in controlnet_type_map[c]["control_region_list"]]
        controlnet_type_map[c]["control_region_list"] = [r for r in tmp_r if r != -1]
        print(f"{c=} / {controlnet_type_map[c]['control_region_list']}")

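# Build a short, filename-safe tag string from the first prompt (not used further in this script)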
prompt_map = region_condi_list[0]["prompt_map"]
prompt_tags = [re.compile(r"[^\w\-, ]").sub("", tag).strip().replace(" ", "-") for tag in prompt_map[list(prompt_map.keys())[0]].split(",")]
prompt_str = "_".join((prompt_tags[:6]))[:50]

torch.manual_seed(0)

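# Run the AnimateDiff pipeline with a 16-frame sliding context window over the full video length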
output = pipeline(
    n_prompt='(nipple:1.4), dudou, shirt, skirt, collar, shawl, hat, sock, sleeve, glove, headgear, back view, monochrome, longbody, lowres, bad anatomy, bad hands, fused fingers, missing fingers, too many fingers, extra digit, fewer digits, cropped, worst quality, low quality, deformed body, bloated, ugly, unrealistic, extra hands and arms',
    num_inference_steps=8,
    guidance_scale=3,
    unet_batch_size=1,
    width=width,
    height=height,
    video_length=length,
    return_dict=False,
    context_frames=16,
    context_stride=1,
    context_overlap=16 // 4,
    context_schedule='composite',
    clip_skip=2,
    controlnet_type_map=controlnet_type_map,
    controlnet_image_map=controlnet_image_map,
    controlnet_ref_map=controlnet_ref_map,
    controlnet_no_shrink=controlnet_no_shrink,
    controlnet_max_samples_on_vram=model_config.controlnet_map["max_samples_on_vram"] if "max_samples_on_vram" in model_config.controlnet_map else 999,
    controlnet_max_models_on_vram=model_config.controlnet_map["max_models_on_vram"] if "max_models_on_vram" in model_config.controlnet_map else 99,
    controlnet_is_loop = model_config.controlnet_map["is_loop"] if "is_loop" in model_config.controlnet_map else True,
    img2img_map=img2img_map,
    ip_adapter_config_map=ip_adapter_config_map,
    region_list=region_list,
    region_condi_list=region_condi_list,
    interpolation_factor=1,
    is_single_prompt_mode=model_config.is_single_prompt_mode,
    apply_lcm_lora=True,
    gradual_latent_map=model_config.gradual_latent_hires_fix_map,
    callback=None,
    callback_steps=None,
)

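# Free controlnet models, convert the output tensor to uint8 HWC frames, and write a temporary 8 fps mp4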
unload_controlnet_models(pipe=pipeline)
frames = output.permute(0, 2, 1, 3, 4).squeeze(0)
frames = frames.mul(255).add_(0.5).clamp_(0, 255).permute(0, 2, 3, 1).to("cpu", torch.uint8).numpy()
with imageio.get_writer('tmp.mp4', fps=8) as writer:
    for frame in frames: writer.append_data(frame)

del pipeline
torch.cuda.empty_cache()
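# Upscale 4x with Real-ESRGAN and enhance faces with GFPGAN, frame by frame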
model = basicsr.archs.rrdbnet_arch.RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64, num_block=23, num_grow_ch=32, scale=4)
upsampler = realesrgan.RealESRGANer(scale=4, model_path='https://huggingface.co/chaowenguo/pal/resolve/main/RealESRGAN_x4plus.pth', model=model, half=True, device='cuda')
face_enhancer = gfpgan.GFPGANer(model_path='https://huggingface.co/chaowenguo/pal/resolve/main/GFPGANv1.4.pth',upscale=4, bg_upsampler=upsampler)
with imageio.get_reader('tmp.mp4') as reader, imageio.get_writer('enhance.mp4', fps=reader.get_meta_data()['fps']) as writer:
    for frame in reader: writer.append_data(face_enhancer.enhance(frame)[-1])

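# Generate background music with MusicGen Melody, chaining each chunk on the previous one, then mux it onto the video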
processor = transformers.AutoProcessor.from_pretrained('chaowenguo/musicgen')
music = transformers.MusicgenMelodyForConditionalGeneration.from_pretrained('chaowenguo/musicgen', torch_dtype=torch.float16).to('cuda')
result = []
for _ in builtins.range(9):
    inputs = processor(audio=result[-1] if result else None, sampling_rate=music.config.audio_encoder.sampling_rate, text='A grand and majestic symphony with soaring strings, powerful brass, and dynamic orchestration. Inspired by Beethoven and Tchaikovsky, featuring dramatic crescendos, delicate woodwind passages, and a triumphant finale. The mood is epic, emotional, and timeless', padding=True, return_tensors='pt').to('cuda')
    inputs = {key:inputs.get(key) if key != 'input_features' else inputs.get(key).to(dtype=music.dtype) for key in inputs}
    audio_values = music.generate(**inputs, max_new_tokens=1000)
    result += audio_values[0, 0].cpu().numpy(),

video = moviepy.VideoFileClip('enhance.mp4')
video.with_audio(moviepy.AudioArrayClip(numpy.concatenate(result)[None].T, 2 * music.config.audio_encoder.sampling_rate)).write_videofile('video.mp4')
```

```sh
# upscale using controlnet (tile, line anime, ip2p, ref)
# specify the directory of the frame generated in the above step
# default config path is 'frames_dir/../prompt.json'
# here, width=512 is specified; even if the original size is already 512, this is still effective at increasing detail
animatediff tile-upscale PATH_TO_TARGET_FRAME_DIRECTORY -c config/prompts/prompt_travel.json -W 512

# upscale width to 768 (smoother than tile-upscale)
animatediff refine PATH_TO_TARGET_FRAME_DIRECTORY -W 768
# If generation takes an unusually long time, there is not enough VRAM.
# Give up on the large size or reduce the context size.
animatediff refine PATH_TO_TARGET_FRAME_DIRECTORY -W 1024 -C 6

# change lora and prompt to make minor changes to the video.
animatediff refine PATH_TO_TARGET_FRAME_DIRECTORY -c config/prompts/some_minor_changed.json
```

#### Video Stylization
```sh
cd animatediff-cli-prompt-travel
venv\Scripts\activate.bat

# If you want to use the 'stylize' command, additional installation required
python -m pip install -e .[stylize]

# create config file from src video
animatediff stylize create-config YOUR_SRC_MOVIE_FILE.mp4

# create config file from src video (img2img)
animatediff stylize create-config YOUR_SRC_MOVIE_FILE.mp4 -i2i

# If you have less than 12GB of vram, specify low vram mode
animatediff stylize create-config YOUR_SRC_MOVIE_FILE.mp4 -lo

# Edit the config file by referring to the hint displayed in the log when the command finishes
# It is recommended to specify a short length for the test run

# generate(test run)
# 16 frames
animatediff stylize generate STYLYZE_DIR -L 16
# 16 frames from the 200th frame
animatediff stylize generate STYLYZE_DIR -L 16 -FO 200

# If generation takes an unusually long time, there is not enough VRAM.
# Give up on the large size or reduce the context size.

# generate
animatediff stylize generate STYLYZE_DIR
```

#### Video Stylization with region
```sh
cd animatediff-cli-prompt-travel
venv\Scripts\activate.bat

# If you want to use the 'stylize create-region' command, additional installation required
python -m pip install -e .[stylize_mask]

# [1] create config file from src video
animatediff stylize create-config YOUR_SRC_MOVIE_FILE.mp4
# for img2img
animatediff stylize create-config YOUR_SRC_MOVIE_FILE.mp4 -i2i

# If you have less than 12GB of vram, specify low vram mode
animatediff stylize create-config YOUR_SRC_MOVIE_FILE.mp4 -lo
```
```json
# in prompt.json (generated in [1])
# [2] write the object you want to mask
# ex.) If you want to mask a person
    "stylize_config": {
        "create_mask": [
            "person"
        ],
        "composite": {
```
```sh
# [3] generate region
animatediff stylize create-region STYLYZE_DIR

# If you have less than 12GB of vram, specify low vram mode
animatediff stylize create-region STYLYZE_DIR -lo

("animatediff stylize create-region -h" for help)
```
```json
# in prompt.json (generated in [1])
# [4] edit region_map, prompt, and controlnet settings. Put the images you want to reference in the ip adapter directories (both background and region)
  "region_map": {
      "0": {
          "enable": true,
          "mask_dir": "..\\stylize\\2023-10-27T19-43-01-sample-mistoonanime_v20\\r_fg_00_2023-10-27T19-44-08\\00_mask",
          "save_mask": true,
          "is_init_img": false, # <----------
          "condition": {
              "prompt_fixed_ratio": 0.5,
              "head_prompt": "",  # <----------
              "prompt_map": {  # <----------
                  "0": "(masterpiece, best quality:1.2), solo, 1girl, kusanagi motoko, looking at viewer, jacket, leotard, thighhighs, gloves, cleavage"
               },
              "tail_prompt": "",  # <----------
              "ip_adapter_map": {
                  "enable": true,
                  "input_image_dir": "..\\stylize\\2023-10-27T19-43-01-sample-mistoonanime_v20\\r_fg_00_2023-10-27T19-44-08\\00_ipadapter",
                  "prompt_fixed_ratio": 0.5,
                  "save_input_image": true,
                  "resized_to_square": false
              }
          }
      },
      "background": {
          "is_init_img": false,  # <----------
          "hint": "background's condition refers to the one in root"
      }
  },
```
```sh
# [5] generate
animatediff stylize generate STYLYZE_DIR
```


#### Video Stylization with mask
```sh
cd animatediff-cli-prompt-travel
venv\Scripts\activate.bat

# If you want to use the 'stylize create-mask' command, additional installation required
python -m pip install -e .[stylize_mask]

# [1] create config file from src video
animatediff stylize create-config YOUR_SRC_MOVIE_FILE.mp4

# If you have less than 12GB of vram, specify low vram mode
animatediff stylize create-config YOUR_SRC_MOVIE_FILE.mp4 -lo
```
```json
# in prompt.json (generated in [1])
# [2] write the object you want to mask
# ex.) If you want to mask a person
    "stylize_config": {
        "create_mask": [
            "person"
        ],
        "composite": {
```
```json
# ex.) person, dog, cat
    "stylize_config": {
        "create_mask": [
            "person", "dog", "cat"
        ],
        "composite": {
```
```json
# ex.) boy, girl
    "stylize_config": {
        "create_mask": [
            "boy", "girl"
        ],
        "composite": {
```
```sh
# [3] generate mask
animatediff stylize create-mask STYLYZE_DIR

# If you have less than 12GB of vram, specify low vram mode
animatediff stylize create-mask STYLYZE_DIR -lo

# The foreground is output to the following directory (FG_STYLYZE_DIR)
# STYLYZE_DIR/fg_00_timestamp_str
# The background is output to the following directory (BG_STYLYZE_DIR)
# STYLYZE_DIR/bg_timestamp_str

("animatediff stylize create-mask -h" for help)

# [4] generate foreground
animatediff stylize generate FG_STYLYZE_DIR

# Same as normal generate.
# The default is controlnet_tile, so if you want to make a big style change,
# such as changing the character, change to openpose, etc.

# Of course, you can also generate the background here.
```
```json
# in prompt.json (generated in [1])
# [5] composite setup
# enter the directory containing the frames generated in [4] in "fg_list".
# In the "mask_prompt" field, write the object you want to extract from the generated foreground frame.
# If you prepared the mask yourself, specify it in mask_path; if a valid path is set, it will be used.
# If the shape has not changed when the foreground is generated, FG_STYLYZE_DIR/00_mask can be used
# enter the directory containing the background frames separated in [3] in "bg_frame_dir".
        "composite": {
            "fg_list": [
                {
                    "path": "FG_STYLYZE_DIR/time_stamp_str/00-341774366206100",
                    "mask_path": " absolute path to mask dir (this is optional) ",
                    "mask_prompt": "person"
                },
                {
                    "path": " absolute path to frame dir ",
                    "mask_path": " absolute path to mask dir (this is optional) ",
                    "mask_prompt": "cat"
                }
            ],
            "bg_frame_dir": "BG_STYLYZE_DIR/00_controlnet_image/controlnet_tile",
            "hint": ""
        },
```
```sh
# [6] composite
animatediff stylize composite STYLYZE_DIR

# By default, "sam hq" and "groundingdino" are used for cropping, but it is not always possible to crop the image well.
# In that case, you can try "rembg" or "anime-segmentation".
# However, when using "rembg" and "anime-segmentation", you cannot specify the target text to be clipped.
animatediff stylize composite STYLYZE_DIR -rem
animatediff stylize composite STYLYZE_DIR -anim

# See help for detailed options. (animatediff stylize composite -h)
```


#### Auto config generation for [Stable-Diffusion-Webui-Civitai-Helper](https://github.com/butaixianran/Stable-Diffusion-Webui-Civitai-Helper) user
```sh
# This command parses the *.civitai.info files and automatically generates config files
# See "animatediff civitai2config -h" for details
animatediff civitai2config PATH_TO_YOUR_A111_LORA_DIR
```
#### Wildcard
- You can pick up wildcards at [civitai](https://civitai.com/models/23799/freecards), then put them in /wildcards.
- Usage is the same as a1111 ( \_\_WILDCARDFILENAME\_\_ format, e.g. \_\_animal\_\_ for animal.txt, \_\_background-color\_\_ for background-color.txt ).
```json
  "prompt_map": {           # __WILDCARDFILENAME__
    "0":  "__character-posture__, __character-gesture__, __character-emotion__, masterpiece, best quality, a beautiful and detailed portriat of muffet, monster girl,((purple body:1.3)), __background__",
```
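
For illustration, here is a minimal sketch of the substitution idea (assuming one wildcard entry per line in each .txt file under /wildcards). The repository's own logic lives in `wild_card_conversion`, so treat this only as a mental model:

```py
import pathlib, random, re

# Illustrative sketch only: replace each __name__ token with a random line from wildcards/name.txt
def expand_wildcards(prompt: str, wildcard_dir: str = "wildcards") -> str:
    def pick(match: re.Match) -> str:
        path = pathlib.Path(wildcard_dir) / f"{match.group(1)}.txt"
        lines = [l.strip() for l in path.read_text(encoding="utf-8").splitlines() if l.strip()]
        return random.choice(lines) if lines else match.group(0)
    return re.sub(r"__([\w\-]+)__", pick, prompt)

# e.g. expand_wildcards("__character-posture__, masterpiece, __background__")
```
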
### Recommended setting
- checkpoint : [mistoonAnime_v20](https://civitai.com/models/24149/mistoonanime) for anime, [xxmix9realistic_v40](https://civitai.com/models/47274) for photoreal
- scheduler : "k_dpmpp_sde"
- upscale : Enable controlnet_tile and controlnet_ip2p only.
- lora and ip adapter

### Recommended settings for 8-12 GB of vram
- max_samples_on_vram : 0
- max_models_on_vram : 0
- Generate at a lower resolution and upscale to a higher resolution with a lower context value.
- In the latest version, the amount of vram used during generation has been reduced.
```sh
animatediff generate -c config/prompts/your_config.json -W 384 -H 576 -L 48 -C 16
animatediff tile-upscale output/2023-08-25T20-00-00-sample-mistoonanime_v20/00-341774366206100 -W 512
```

### Limitations
- lora support is limited. Not all formats can be used!!!
- It is not possible to specify lora in the prompt.

### Related resources
- [AnimateDiff](https://github.com/guoyww/AnimateDiff)
- [ControlNet](https://github.com/lllyasviel/ControlNet)
- [IP-Adapter](https://github.com/tencent-ailab/IP-Adapter)
- [DWPose](https://github.com/IDEA-Research/DWPose)
- [softmax-splatting](https://github.com/sniklaus/softmax-splatting)
- [sam-hq](https://github.com/SysCV/sam-hq)
- [Grounded-Segment-Anything](https://github.com/IDEA-Research/Grounded-Segment-Anything)
- [ProPainter](https://github.com/sczhou/ProPainter)
- [rembg](https://github.com/danielgatis/rembg)
- [anime-segmentation](https://github.com/SkyTNT/anime-segmentation)
- [LCM-LoRA](https://github.com/luosiallen/latent-consistency-model)
- [ControlNet-LLLite](https://github.com/kohya-ss/sd-scripts/blob/main/docs/train_lllite_README.md)
- [Gradual Latent hires fix](https://github.com/kohya-ss/sd-scripts/tree/gradual_latent_hires_fix)
<br>
<br>
<br>
<br>
<br>

Below is the original readme.  

----------------------------------------------------------  


# animatediff
[![pre-commit.ci status](https://results.pre-commit.ci/badge/github/neggles/animatediff-cli/main.svg)](https://results.pre-commit.ci/latest/github/neggles/animatediff-cli/main)

animatediff refactor, ~~because I can.~~ with significantly lower VRAM usage.

Also, **infinite generation length support!** yay!

# LoRA loading is ABSOLUTELY NOT IMPLEMENTED YET!

This can theoretically run on CPU, but it's not recommended. Should work fine on a GPU, nVidia or otherwise,
but I haven't tested on non-CUDA hardware. Uses PyTorch 2.0 Scaled-Dot-Product Attention (aka builtin xformers)
by default, but you can pass `--xformers` to force using xformers if you *really* want.

### How To Use

1. Lie down
2. Try not to cry
3. Cry a lot

### but for real?

Okay, fine. But it's still a little complicated and there's no webUI yet.

```sh
git clone https://github.com/neggles/animatediff-cli
cd animatediff-cli
python3.10 -m venv .venv
source .venv/bin/activate
# install Torch. Use whatever your favourite torch version >= 2.0.0 is, but, good luck on non-nVidia...
python -m pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
# install the rest of all the things (probably! I may have missed some deps.)
python -m pip install -e '.[dev]'
# you should now be able to
animatediff --help
# There's a nice pretty help screen with a bunch of info that'll print here.
```

From here you'll need to put whatever checkpoint you want to use into `data/models/sd`, copy
one of the prompt configs in `config/prompts`, edit it with your choices of prompt and model (model
paths in prompt .json files are **relative to `data/`**, e.g. `models/sd/vanilla.safetensors`), and
off you go.

Then it's something like (for an 8GB card):
```sh
animatediff generate -c 'config/prompts/waifu.json' -W 576 -H 576 -L 128 -C 16
```
You may have to drop `-C` down to 8 on cards with less than 8GB VRAM, and you can raise it to 20-24
on cards with more. 24 is max.

N.B. generating 128 frames is _**slow...**_

## RiFE!

I have added experimental support for [rife-ncnn-vulkan](https://github.com/nihui/rife-ncnn-vulkan)
using the `animatediff rife interpolate` command. It has fairly self-explanatory help, and it has
been tested on Linux, but I've **no idea** if it'll work on Windows.

Either way, you'll need ffmpeg installed on your system and present in PATH, and you'll need to
download the rife-ncnn-vulkan release for your OS of choice from the GitHub repo (above). Unzip it, and
place the extracted folder at `data/rife/`. You should have a `data/rife/rife-ncnn-vulkan` executable, or `data\rife\rife-ncnn-vulkan.exe` on Windows.

You'll also need to reinstall the repo/package with:
```sh
python -m pip install -e '.[rife]'
```
or just install `ffmpeg-python` manually yourself.

Default is to multiply each frame by 8, turning an 8fps animation into a 64fps one, then encode
that to a 60fps WebM. (If you pick GIF mode, it'll be 50fps, because GIFs are cursed and encode
frame durations as 1/100ths of a second).
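
As a quick check on that GIF note, the arithmetic works out like this (illustrative only):

```py
base_fps, factor = 8, 8
interp_fps = base_fps * factor       # 64 fps after 8x RIFE interpolation
gif_delay = round(100 / interp_fps)  # GIF stores frame delays in whole 1/100ths of a second -> 2
print(interp_fps, 100 / gif_delay)   # 64 50.0, hence the ~50fps GIF
```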

Seems to work pretty well...

## TODO:

In no particular order:

- [x] Infinite generation length support
- [x] RIFE support for motion interpolation (`rife-ncnn-vulkan` isn't the greatest implementation)
- [x] Export RIFE interpolated frames to a video file (webm, mp4, animated webp, hevc mp4, gif, etc.)
- [x] Generate infinite length animations on a 6-8GB card (at 512x512 with 8-frame context, but hey it'll do)
- [x] Torch SDP Attention (makes xformers optional)
- [x] Support for `clip_skip` in prompt config
- [x] Experimental support for `torch.compile()` (upstream Diffusers bugs slow this down a little but it's still zippy)
- [x] Batch your generations with `--repeat`! (e.g. `--repeat 10` will repeat all your prompts 10 times)
- [x] Call the `animatediff.cli.generate()` function from another Python program without reloading the model every time
- [x] Drag remaining old Diffusers code up to latest (mostly)
- [ ] Add a webUI (maybe, there are people wrapping this already so maybe not?)
- [ ] img2img support (start from an existing image and continue)
- [ ] Stop using custom modules where possible (should be able to use Diffusers for almost all of it)
- [ ] Automatic generate-then-interpolate-with-RIFE mode

## Credits:

see [guoyww/AnimateDiff](https://github.com/guoyww/AnimateDiff) (very little of this is my work)

n.b. the copyright notice in `COPYING` is missing the original authors' names, solely because
the original repo (as of this writing) has no name attached to the license. I have, however,
used the same license they did (Apache 2.0).