VModel/talking-photo-turbo-pro
API to convert photos into realistic talking avatars in seconds.
Output: $0.012 / second or 83 seconds / $1
Input
avatar * image
URL of the input image.
speech * audio
URL of the audio file.
resolution enum
Maximum side length of the output video, either 720 or 480. Optional; defaults to 720.
disable_safety_checker boolean
Note: The website version of this model always runs with safety checks enabled. For details, see VModel's platform safety guidelines.
Disable safety checker for generated images
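The input fields above can be assembled into a single input object before a task is submitted. The following is a minimal sketch, not an official client: `build_input` is a hypothetical helper, and it assumes `resolution` is sent as a string, as it appears in the sample output on this page. How the object is submitted (endpoint, authentication) is not described here and is left out.

```python
# Hypothetical helper: builds the "input" object from the fields listed above.
ALLOWED_RESOLUTIONS = {"480", "720"}


def build_input(avatar, speech, resolution="720", disable_safety_checker=False):
    """Return an input dict for talking-photo-turbo-pro.

    avatar and speech are required URLs; resolution is optional and
    defaults to "720", matching the documented default.
    """
    if not avatar or not speech:
        raise ValueError("avatar and speech are both required")
    resolution = str(resolution)  # accept 480 as well as "480"
    if resolution not in ALLOWED_RESOLUTIONS:
        raise ValueError("resolution must be 480 or 720")
    return {
        "avatar": avatar,
        "speech": speech,
        "resolution": resolution,
        "disable_safety_checker": bool(disable_safety_checker),
    }
```

Validating locally keeps an unsupported resolution from being submitted as a task.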
Output
{
  "task_id": "d9oo2z1s89lobg8oz5",
  "user_id": 1,
  "version": "ae74513f15f2bb0e42acf4023d7cd6dbddd61242c5538b71f830a630aacf1c9d",
  "error": null,
  "total_time": 37,
  "predict_time": 37,
  "logs": null,
  "output": [
    "https://vmodel.ai/data/model/vmodel/talking-photo-turbo-pro/result.mp4"
  ],
  "status": "succeeded",
  "create_at": 1746492954,
  "completed_at": 1746493015,
  "input": {
    "avatar": "https://vmodel.ai/data/model/vmodel/talking-photo-turbo-pro/demo.webp",
    "speech": "https://vmodel.ai/data/model/vmodel/talking-photo-turbo-pro/examples_wav_talk_male_law_10s.wav",
    "disable_safety_checker": false,
    "resolution": "480"
  }
}
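A result object like the one above can be checked and unpacked in a few lines. This is a sketch under assumptions: `extract_video_url` is a hypothetical helper, and because this page shows only the `"succeeded"` status, any other status is treated as not usable.

```python
def extract_video_url(task):
    """Return the first output URL of a finished task, or raise if unusable."""
    status = task.get("status")
    if status != "succeeded":
        # Other status values are not documented on this page (assumption).
        raise RuntimeError(
            f"task {task.get('task_id')} not usable: "
            f"status={status!r}, error={task.get('error')!r}"
        )
    outputs = task.get("output") or []
    if not outputs:
        raise RuntimeError("task succeeded but returned no output")
    return outputs[0]


# Trimmed copy of the sample result shown above.
sample = {
    "task_id": "d9oo2z1s89lobg8oz5",
    "error": None,
    "output": [
        "https://vmodel.ai/data/model/vmodel/talking-photo-turbo-pro/result.mp4"
    ],
    "status": "succeeded",
}
print(extract_video_url(sample))
```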
Generated in: 37 seconds
Pricing
Model pricing for vmodel/talking-photo-turbo-pro. Looking for volume pricing? Get in touch.
When model variant is 480: $0.012000 per generation at 480, or 83 generations for $1
When model variant is 720: $0.025000 per generation at 720, or 40 generations for $1
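The "for $1" figures follow from dividing one dollar by the listed price and rounding down. A small sketch of that arithmetic is below; the names are hypothetical, and the billing unit is left abstract because the page quotes the rate both per second and per generation.

```python
from decimal import Decimal

# Listed prices per billing unit, keyed by model variant (from the pricing above).
PRICE_PER_UNIT = {"480": Decimal("0.012"), "720": Decimal("0.025")}


def units_per_dollar(variant):
    """Whole billing units that $1 buys at the listed price."""
    return int(Decimal("1") // PRICE_PER_UNIT[variant])


def estimate_cost(variant, units):
    """Cost in dollars for the given number of billing units."""
    return PRICE_PER_UNIT[variant] * units


print(units_per_dollar("480"))  # 83
print(units_per_dollar("720"))  # 40
```

`Decimal` is used instead of floats so that the floor division is exact.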