Talking Photo API – Turn Photos into Realistic Talking Avatars

VModel/talking-photo-turbo

API to convert photos into realistic talking avatars in seconds.

Output: $0.002 / second or 500 seconds / $1

Output

{
  "task_id": "d9oo2z1s89lobg8oz5",
  "user_id": 1,
  "version": "11fee5368eda61d569f53f1b24ce1c53b06c867157cd833e9a0a97b66096f974",
  "error": null,
  "total_time": 37,
  "predict_time": 37,
  "logs": null,
  "output": [
    "https://vmodel.ai/data/model/vmodel/talking-photo-turbo/result.mp4"
  ],
  "status": "succeeded",
  "create_at": 1746492954,
  "completed_at": 1746493015,
  "input": {
    "avatar": "https://vmodel.ai/data/model/vmodel/talking-photo-turbo/demo.png",
    "speech": "https://vmodel.ai/data/model/vmodel/talking-photo-turbo/examples_wav_talk_male_law_10s.wav",
    "disable_safety_checker": false
  }
}

Generated in: 37 seconds
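
The task object above carries the result URL in its "output" array. As a minimal sketch, assuming that "succeeded" is the only status that means the result is ready (other status values are not documented here), a client could extract the URLs like this. The helper name extract_output_urls is hypothetical, and the sample dict is a trimmed copy of the response shown above.

```python
def extract_output_urls(task: dict) -> list:
    """Return the output URLs of a finished task.

    Assumption: any status other than "succeeded" means the result
    is not available (still running, failed, or cancelled).
    """
    status = task.get("status")
    if status != "succeeded":
        raise RuntimeError(
            f"task {task.get('task_id')!r} has no result: "
            f"status={status!r}, error={task.get('error')!r}"
        )
    # "output" is a JSON array of URLs in the sample response.
    return list(task.get("output") or [])


# Trimmed copy of the sample response from this page.
sample = {
    "task_id": "d9oo2z1s89lobg8oz5",
    "error": None,
    "output": [
        "https://vmodel.ai/data/model/vmodel/talking-photo-turbo/result.mp4"
    ],
    "status": "succeeded",
}

print(extract_output_urls(sample))
```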

HTTP Request

Run vmodel/talking-photo-turbo:11fee5368eda61d569f53f1b24ce1c53b06c867157cd833e9a0a97b66096f974 using VModel's HTTP API.

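  curl -X POST https://api.vmodel.ai/api/tasks/v1/create \
    -H "Authorization: Bearer $VModel_API_TOKEN" \
    -H "Content-Type: application/json" \
    -d '{
      "version": "11fee5368eda61d569f53f1b24ce1c53b06c867157cd833e9a0a97b66096f974",
      "input": {
        "avatar": "https://vmodel.ai/data/model/vmodel/talking-photo-turbo/demo.png",
        "speech": "https://vmodel.ai/data/model/vmodel/talking-photo-turbo/examples_wav_talk_male_law_10s.wav"
      }
    }'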
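
The same request can be sketched in Python with only the standard library. The endpoint, headers, and version hash are taken from the curl command above. The function names build_create_request and create_task are hypothetical, and the shape of the create endpoint's response is not documented on this page, so the sketch simply returns the parsed JSON.

```python
import json
import urllib.request

API_URL = "https://api.vmodel.ai/api/tasks/v1/create"
VERSION = "11fee5368eda61d569f53f1b24ce1c53b06c867157cd833e9a0a97b66096f974"


def build_create_request(task_input: dict, token: str) -> urllib.request.Request:
    """Build (but do not send) the POST request that creates a task."""
    body = json.dumps({"version": VERSION, "input": task_input}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def create_task(task_input: dict, token: str) -> dict:
    """Send the request and return the decoded JSON body.

    Assumption: the endpoint answers with a JSON object; its exact
    fields are not specified on this page.
    """
    request = build_create_request(task_input, token)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

To run it, read the token from the same environment variable the curl example uses (VModel_API_TOKEN) and pass a dict with the avatar and speech URLs as task_input.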

Input Schema

The fields you can pass when running this model through the API. If you omit a field, its default value is used.

avatar

Type: image

Default value: -

Description: URL of the input image

speech

Type: audio

Default value: -

Description: URL of the input audio
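
Since both fields are URLs and neither has a default, it can help to check them on the client before creating a task. The following is a small sketch under two assumptions that this page does not state explicitly: both fields are required, and both must be http or https URLs. The helper name build_input is hypothetical.

```python
from urllib.parse import urlparse


def build_input(avatar_url: str, speech_url: str) -> dict:
    """Return the "input" object for the request after basic URL checks.

    Assumption: avatar and speech must be absolute http(s) URLs.
    """
    for name, value in (("avatar", avatar_url), ("speech", speech_url)):
        parts = urlparse(value)
        if parts.scheme not in ("http", "https") or not parts.netloc:
            raise ValueError(f"{name} must be an http(s) URL, got {value!r}")
    return {"avatar": avatar_url, "speech": speech_url}


task_input = build_input(
    "https://vmodel.ai/data/model/vmodel/talking-photo-turbo/demo.png",
    "https://vmodel.ai/data/model/vmodel/talking-photo-turbo/examples_wav_talk_male_law_10s.wav",
)
```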

Pricing

Model pricing for vmodel/talking-photo-turbo. Looking for volume pricing? Get in touch.

When using this model: $0.0020 per second of input audio, or 500 seconds for $1.
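
As a worked example of the rate above, the sketch below multiplies the audio duration by $0.002 per second. It assumes linear billing with no minimum charge and no rounding to whole seconds, which this page does not confirm. The function name estimate_cost_usd is hypothetical.

```python
from decimal import Decimal

# Listed rate: USD per second of input audio.
PRICE_PER_SECOND_USD = Decimal("0.002")


def estimate_cost_usd(audio_seconds) -> Decimal:
    """Estimate the charge for one task from the input audio length.

    Assumption: cost scales linearly with duration; any minimum
    charge or rounding rule is not documented here.
    """
    seconds = Decimal(str(audio_seconds))
    if seconds < 0:
        raise ValueError("audio_seconds must be non-negative")
    return PRICE_PER_SECOND_USD * seconds


ten_second_clip = estimate_cost_usd(10)    # a 10-second clip
one_dollar_worth = estimate_cost_usd(500)  # the "500 seconds for $1" case
```

Decimal is used rather than float so that small per-second amounts add up without binary rounding error.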
