wan-2.1-i2v-480p | Fast Image-to-Video Generation with 480p Output

VModel/wan-2.1-i2v-480p

Experience accelerated image-to-video generation with wan-2.1-i2v-480p. Powered by the Wan 2.1 14B model suite, delivering efficient 480p video synthesis for research and production use.

Pricing: $0.50 per use, or 2 uses for $1

Input

prompt * string

Prompt for video generation

image * image

Input image to start generating from

num_frames int

Number of video frames. 81 frames give the best results

max_area enum

Maximum area of generated image. The input image will shrink to fit these dimensions

frames_per_second int

Frames per second. Note that the pricing of this model is based on the video duration at 16 fps

fast_mode enum

Speed up generation with different levels of acceleration. Faster modes may degrade quality somewhat. The speedup is dependent on the content, so different videos may see different speedups.

sample_steps int

Number of generation steps. Fewer steps mean faster generation, at the expense of output quality. 30 steps is sufficient for most prompts

sample_guide_scale int

Higher guide scale makes prompt adherence better, but can reduce variation

sample_shift int

Sample shift factor

lora_weights string

Load LoRA weights. Supports Replicate models in the format <owner>/<username> or <owner>/<username>/<version>, HuggingFace URLs in the format huggingface.co/<owner>/<model-name>, CivitAI URLs in the format civitai.com/models/<id>[/<model-name>], or arbitrary .safetensors URLs from the Internet. For example, 'fofr/flux-pixar-cars'

lora_scale int

Determines how strongly the main LoRA should be applied. Sensible results fall between 0 and 1 for base inference. For go_fast we apply a 1.5x multiplier to this value; we've generally seen good performance when scaling the base value by that amount. You may still need to experiment to find the best value for your particular LoRA.

seed int

Random seed. Leave blank to randomize the seed

disable_safety_checker boolean

Disable safety checker for generated images

Note: The website version of this model always runs with safety checks enabled. For details, see VModel's platform safety guidelines.


Output

{
  "task_id": "qaldrg3a9d9mfiw2tf",
  "user_id": 1,
  "version": "009719e7de9128f21878a3c96fe39663cc29c7d37103ca0b59f8a5d5b15ff73e",
  "error": null,
  "total_time": 30.8,
  "predict_time": 30.2,
  "logs": null,
  "output": [
    "https://vmodel.ai/data/model/vmodel/wan-2.1-i2v-480p/result.mp4"
  ],
  "status": "succeeded",
  "create_at": null,
  "input": {
    "prompt": "A woman is talking",
    "image": "https://vmodel.ai/data/model/vmodel/wan-2.1-i2v-480p/2.png",
    "max_area": "832x480",
    "fast_mode": "Balanced",
    "lora_scale": 1,
    "num_frames": 81,
    "sample_shift": 3,
    "sample_steps": 30,
    "frames_per_second": 16,
    "sample_guide_scale": 5,
    "disable_safety_checker": false
  }
}
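The response above can be consumed programmatically. The following is a minimal sketch in Python, assuming the response body always carries the `status`, `error`, and `output` fields shown in the sample; the helper name `extract_video_urls` is hypothetical and not part of any VModel client.

```python
import json
from typing import List


def extract_video_urls(response_text: str) -> List[str]:
    """Return the output URLs of a finished task, or raise if it did not succeed."""
    task = json.loads(response_text)
    # Assumption: a non-null "error" field means the task failed.
    if task.get("error"):
        raise RuntimeError(f"task {task.get('task_id')} failed: {task['error']}")
    # Assumption: "succeeded" is the only status that carries usable output.
    if task.get("status") != "succeeded":
        raise RuntimeError(f"task {task.get('task_id')} has status {task.get('status')!r}")
    return list(task.get("output") or [])


# Trimmed copy of the sample response shown above.
SAMPLE_RESPONSE = """
{
  "task_id": "qaldrg3a9d9mfiw2tf",
  "error": null,
  "predict_time": 30.2,
  "output": ["https://vmodel.ai/data/model/vmodel/wan-2.1-i2v-480p/result.mp4"],
  "status": "succeeded"
}
"""

if __name__ == "__main__":
    for url in extract_video_urls(SAMPLE_RESPONSE):
        print(url)
```

Raising on any status other than `succeeded` keeps callers from treating an unfinished task as an empty result.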

Generated in: 30.2 seconds


HTTP Request

Run vmodel/wan-2.1-i2v-480p:009719e7de9128f21878a3c96fe39663cc29c7d37103ca0b59f8a5d5b15ff73e using VModel's HTTP API.

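  # Each continued line ends with a backslash so the shell reads one command.
  curl -X POST https://api.vmodel.ai/api/tasks/v1/create \
    -H "Authorization: Bearer $VModel_API_TOKEN" \
    -H "Content-Type: application/json" \
    -d '{
    "version": "009719e7de9128f21878a3c96fe39663cc29c7d37103ca0b59f8a5d5b15ff73e",
    "input": {
      "prompt": "A woman is talking",
      "image": "https://vmodel.ai/data/model/vmodel/wan-2.1-i2v-480p/2.png"
    }
}'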
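The same request can be issued from Python. This is a sketch using only the standard library, assuming the endpoint accepts the JSON body shown in the curl command; the function names `build_create_request` and `send` are hypothetical, and the prompt and image values in the usage note are taken from the sample output on this page.

```python
import json
import urllib.request

API_URL = "https://api.vmodel.ai/api/tasks/v1/create"
VERSION = "009719e7de9128f21878a3c96fe39663cc29c7d37103ca0b59f8a5d5b15ff73e"


def build_create_request(token: str, model_input: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request that creates a task."""
    body = json.dumps({"version": VERSION, "input": model_input}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response (performs network I/O)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

A typical call reads the token from the environment, for example `send(build_create_request(os.environ["VModel_API_TOKEN"], {"prompt": "A woman is talking", "image": "https://vmodel.ai/data/model/vmodel/wan-2.1-i2v-480p/2.png"}))` after `import os`. Keeping request construction separate from sending makes the payload easy to inspect before any credits are spent.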

Input Schema

The fields you can use to run this model through the API. If you don't give a value for a field, its default value is used.

prompt

Type: string

Default value: -

Description: Prompt for video generation

image

Type: image

Default value: -

Description: Input image to start generating from

num_frames

Type: int

Default value: 81

Description: Number of video frames. 81 frames give the best results

Range: Min: 81 | Max: 100

max_area

Type: enum

Default value: 832x480

Description: Maximum area of generated image. The input image will shrink to fit these dimensions

Choices: 832x480, 480x832

frames_per_second

Type: int

Default value: 16

Description: Frames per second. Note that the pricing of this model is based on the video duration at 16 fps

Range: Min: 5 | Max: 24
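The relationship between num_frames and frames_per_second is simple arithmetic: playback duration is the frame count divided by the frame rate. The sketch below illustrates it; the function name is hypothetical, and how VModel computes the billed duration is not specified beyond the note that it is based on 16 fps.

```python
def clip_duration_seconds(num_frames: int = 81, frames_per_second: int = 16) -> float:
    """Playback duration of a clip: frame count divided by frame rate."""
    if frames_per_second <= 0:
        raise ValueError("frames_per_second must be positive")
    return num_frames / frames_per_second


if __name__ == "__main__":
    # Defaults from the schema: 81 frames at 16 fps.
    print(clip_duration_seconds())        # 5.0625
    # The same 81 frames played back at the maximum rate of 24 fps.
    print(clip_duration_seconds(81, 24))  # 3.375
```

Raising frames_per_second shortens playback without changing the number of generated frames, which is consistent with the description stating that pricing follows the duration at 16 fps rather than the chosen rate.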

fast_mode

Type: enum

Default value: Balanced

Description: Speed up generation with different levels of acceleration. Faster modes may degrade quality somewhat. The speedup is dependent on the content, so different videos may see different speedups.

Choices: Balanced, Off, Fast

sample_steps

Type: int

Default value: 30

Description: Number of generation steps. Fewer steps mean faster generation, at the expense of output quality. 30 steps is sufficient for most prompts

Range: Min: 1 | Max: 40

sample_guide_scale

Type: int

Default value: 5

Description: Higher guide scale makes prompt adherence better, but can reduce variation

Range: Min: 0 | Max: 10

sample_shift

Type: int

Default value: 3

Description: Sample shift factor

Range: Min: 1 | Max: 10

lora_weights

Type: string

Default value:

Description: Load LoRA weights. Supports Replicate models in the format <owner>/<username> or <owner>/<username>/<version>, HuggingFace URLs in the format huggingface.co/<owner>/<model-name>, CivitAI URLs in the format civitai.com/models/<id>[/<model-name>], or arbitrary .safetensors URLs from the Internet. For example, 'fofr/flux-pixar-cars'
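To check a lora_weights value before submitting it, the accepted formats can be approximated with regular expressions. This is only a sketch: the patterns are assumptions derived from the description above, the server's actual parsing rules are not documented here, and `classify_lora_source` is a hypothetical helper.

```python
import re
from typing import Optional

# Order matters: the URL-like forms are tested before the bare owner/name form,
# which would otherwise also match strings such as "huggingface.co/owner/model".
_PATTERNS = [
    ("huggingface", re.compile(r"^(?:https?://)?huggingface\.co/[^/\s]+/[^/\s]+/?$")),
    ("civitai", re.compile(r"^(?:https?://)?civitai\.com/models/\d+(?:/[^/\s]+)?/?$")),
    ("safetensors_url", re.compile(r"^https?://\S+\.safetensors(?:\?\S*)?$")),
    ("replicate", re.compile(r"^[\w.-]+/[\w.-]+(?:/[\w.-]+)?$")),
]


def classify_lora_source(value: str) -> Optional[str]:
    """Return a label for the first matching format, or None if nothing matches."""
    value = value.strip()
    for label, pattern in _PATTERNS:
        if pattern.match(value):
            return label
    return None


if __name__ == "__main__":
    print(classify_lora_source("fofr/flux-pixar-cars"))
    print(classify_lora_source("civitai.com/models/12345"))
```

A `None` result only means the string does not look like any documented format; it does not guarantee that a matching string points to weights the model can load.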

lora_scale

Type: int

Default value:

Description: Determines how strongly the main LoRA should be applied. Sensible results fall between 0 and 1 for base inference. For go_fast we apply a 1.5x multiplier to this value; we've generally seen good performance when scaling the base value by that amount. You may still need to experiment to find the best value for your particular LoRA.

Range: Min: 1

seed

Type: int

Default value: 0

Description: Random seed. Leave blank to randomize the seed
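The schema above can be checked on the client before a request is sent. The sketch below fills in the documented defaults and enforces the documented ranges and choices. It assumes that the fields marked with an asterisk on this page (prompt and image) are required, and it passes lora_weights, lora_scale, and seed through unchecked because their defaults or ranges are incomplete here. The names `SCHEMA` and `prepare_input` are hypothetical.

```python
# Defaults, ranges, and choices copied from the Input Schema above.
SCHEMA = {
    "num_frames": {"default": 81, "min": 81, "max": 100},
    "max_area": {"default": "832x480", "choices": ("832x480", "480x832")},
    "frames_per_second": {"default": 16, "min": 5, "max": 24},
    "fast_mode": {"default": "Balanced", "choices": ("Balanced", "Off", "Fast")},
    "sample_steps": {"default": 30, "min": 1, "max": 40},
    "sample_guide_scale": {"default": 5, "min": 0, "max": 10},
    "sample_shift": {"default": 3, "min": 1, "max": 10},
}
REQUIRED = ("prompt", "image")  # assumption: the fields marked with * are mandatory


def prepare_input(user_input: dict) -> dict:
    """Return a copy of user_input with defaults applied, or raise ValueError."""
    missing = [name for name in REQUIRED if not user_input.get(name)]
    if missing:
        raise ValueError(f"missing required field(s): {', '.join(missing)}")
    result = dict(user_input)
    for name, rule in SCHEMA.items():
        value = result.setdefault(name, rule["default"])
        if "choices" in rule and value not in rule["choices"]:
            raise ValueError(f"{name} must be one of {rule['choices']}, got {value!r}")
        if "min" in rule and not rule["min"] <= value <= rule["max"]:
            raise ValueError(f"{name} must be in [{rule['min']}, {rule['max']}], got {value!r}")
    return result
```

Because the API applies defaults on its side when a field is omitted, filling them in locally is optional; the benefit is that an out-of-range value fails immediately instead of after a round trip.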


Pricing

Model pricing for vmodel/wan-2.1-i2v-480p. Looking for volume pricing? Get in touch.

When using this model: $0.5000 per use, or 2 uses for $1
