Generate high-quality videos from text descriptions with Kling 3.0 Standard. Supports multiple aspect ratios, durations, and optional audio generation.
Property          Value
Provider          Kling
Model             Kling 3.0 Standard
Capability        Text to Video
Base Cost         84,000 micro-cents/second ($0.084/sec)
Processing Time   ~240 seconds
Request Body
model: Model slug. Use kling/kling-3.0-standard/text-to-video for text-to-video generation.
input: Input parameters for text-to-video generation.
input.prompt: Text description of the video to generate (max 2500 characters).
input.aspect_ratio: Video aspect ratio. Default: 16:9. Options: 16:9, 9:16, 1:1.
input.duration: Video duration in seconds. Default: 5. Options: 5, 10, 15.
input.has_sound: Enable audio generation. Default: false. Options: true, false.
HTTPS URL to receive a webhook notification when the job completes or fails.
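As an illustration of the constraints listed above, the following Python sketch builds a request body and rejects values outside the documented options before anything is sent. The function name build_request_body is hypothetical. The field names, and the choice to send duration and has_sound as strings, are taken from the curl example on this page and are assumptions about what the API accepts. The webhook field is left out because its name is not shown here.

```python
import json

MODEL_SLUG = "kling/kling-3.0-standard/text-to-video"
ASPECT_RATIOS = ("16:9", "9:16", "1:1")   # documented options
DURATIONS = (5, 10, 15)                   # documented options, in seconds
MAX_PROMPT_CHARS = 2500                   # documented prompt limit


def build_request_body(prompt: str, aspect_ratio: str = "16:9",
                       duration: int = 5, has_sound: bool = False) -> dict:
    """Return a request body dict, validating against the documented options.

    Hypothetical helper: not part of any official SDK.
    """
    if not prompt:
        raise ValueError("prompt must not be empty")
    if len(prompt) > MAX_PROMPT_CHARS:
        raise ValueError(f"prompt exceeds {MAX_PROMPT_CHARS} characters")
    if aspect_ratio not in ASPECT_RATIOS:
        raise ValueError(f"aspect_ratio must be one of {ASPECT_RATIOS}")
    if duration not in DURATIONS:
        raise ValueError(f"duration must be one of {DURATIONS}")

    return {
        "model": MODEL_SLUG,
        "input": {
            "prompt": prompt,
            "aspect_ratio": aspect_ratio,
            # Assumption: values are sent as strings, mirroring the curl example.
            "duration": str(duration),
            "has_sound": "true" if has_sound else "false",
        },
    }


if __name__ == "__main__":
    body = build_request_body("A golden retriever running through a sunlit meadow")
    print(json.dumps(body, indent=2))
```

Validating locally is optional, but it avoids submitting a job that the service would presumably reject.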
Pricing
Base cost: 84,000 micro-cents per second ($0.084/sec)
finalCost = baseCost × duration × has_sound
Factor     Option   Multiplier
Duration   5        5x
Duration   10       10x
Duration   15       15x
Sound      false    1x
Sound      true     1.5x
Default cost: 5 seconds, no sound = 84,000 × 5 × 1 = 420,000 micro-cents ($0.42)
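The pricing formula can be checked with a few lines of Python. This is a minimal sketch: the function names estimate_cost and to_dollars are hypothetical, and the dollar conversion assumes 1,000,000 micro-cents per dollar, which is what the figure "84,000 micro-cents = $0.084" on this page implies. Integer arithmetic is used so the 1.5x sound multiplier introduces no rounding error.

```python
BASE_COST_PER_SECOND = 84_000        # micro-cents per second (from this page)
DURATIONS = (5, 10, 15)              # allowed durations; multiplier equals seconds
MICRO_CENTS_PER_DOLLAR = 1_000_000   # assumption implied by 84,000 = $0.084


def estimate_cost(duration: int = 5, has_sound: bool = False) -> int:
    """Return the estimated job cost in micro-cents."""
    if duration not in DURATIONS:
        raise ValueError(f"duration must be one of {DURATIONS}")
    cost = BASE_COST_PER_SECOND * duration
    if has_sound:
        cost = cost * 3 // 2         # 1.5x sound multiplier, kept exact
    return cost


def to_dollars(micro_cents: int) -> float:
    """Convert micro-cents to dollars using the ratio implied by this page."""
    return micro_cents / MICRO_CENTS_PER_DOLLAR


if __name__ == "__main__":
    for seconds in DURATIONS:
        for sound in (False, True):
            cost = estimate_cost(seconds, sound)
            print(f"{seconds:>2}s sound={sound!s:<5} {cost:>9,} micro-cents (${to_dollars(cost):.2f})")
```

With the defaults (5 seconds, no sound) this reproduces the 420,000 micro-cents ($0.42) figure above. The most expensive combination, 15 seconds with sound, comes to 84,000 × 15 × 1.5 = 1,890,000 micro-cents ($1.89).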
Response
Unique identifier for the submitted job.
Initial job status. Always "pending" on successful submission.
ISO 8601 timestamp of the estimated completion time.
The cost of the job in micro-cents.
Code Examples
curl -X POST https://api.muvi.video/v1/jobs/submit \
  -H "Authorization: Bearer $PIXELBYTE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "kling/kling-3.0-standard/text-to-video",
    "input": {
      "prompt": "A golden retriever running through a sunlit meadow",
      "aspect_ratio": "16:9",
      "duration": "5",
      "has_sound": "true"
    }
  }'
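The same request can be sketched in Python using only the standard library. The endpoint, headers, and body mirror the curl command above. The helper names build_submit_request and submit_job are hypothetical, and the assumption that the service replies with a JSON object is inferred from the Response section rather than confirmed.

```python
import json
import urllib.request

SUBMIT_URL = "https://api.muvi.video/v1/jobs/submit"  # endpoint from the curl example


def build_submit_request(api_key: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request for a job submission."""
    return urllib.request.Request(
        SUBMIT_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def submit_job(api_key: str, body: dict, timeout: float = 30.0) -> dict:
    """Send the request and decode the reply.

    Assumption: the reply is a JSON object. urlopen raises
    urllib.error.HTTPError for 4xx and 5xx status codes.
    """
    request = build_submit_request(api_key, body)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

To submit a job, read the key from the environment, as the curl example does with PIXELBYTE_API_KEY, and call submit_job(os.environ["PIXELBYTE_API_KEY"], body) with a body shaped like the JSON above. Keeping request construction separate from sending makes the headers and payload easy to inspect without touching the network.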