Integrate AI-powered transcription and subtitle generation into your apps, bots, and workflows. Get word-level timestamps, multi-language support, and TikTok-style ASS subtitles via a simple REST API.
Get up and running with the FastCaption API in under 2 minutes.
POST to /api/v1/transcribe with your API key as a Bearer token:
curl -X POST https://fastcaption.app/api/v1/transcribe \
  -H "Authorization: Bearer fc-YOUR_API_KEY" \
  -F "audio=@./test.mp3" \
  -F "language=th"
All API requests require a valid API key sent as a Bearer token in the Authorization header.
Include your API key in every request using the Authorization header:
Authorization: Bearer fc-YOUR_API_KEY
⚠️ Keep your API key secret. Do not expose it in client-side code or public repositories. If compromised, delete the key immediately from your Dashboard and create a new one.
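One way to keep the key out of source code is to read it from the environment at runtime. The snippet below is a minimal sketch of that idea, not an official client. The variable name `FASTCAPTION_API_KEY` and the helper `auth_headers` are made up for illustration.

```python
import os
from typing import Dict, Optional

# Hypothetical environment variable name; the API docs do not prescribe one.
API_KEY_ENV = "FASTCAPTION_API_KEY"


def auth_headers(api_key: Optional[str] = None) -> Dict[str, str]:
    """Build the Authorization header for a FastCaption request."""
    # Prefer an explicitly passed key, otherwise fall back to the environment.
    key = api_key if api_key is not None else os.environ.get(API_KEY_ENV, "")
    key = key.strip()
    if not key:
        raise ValueError(f"No API key given and {API_KEY_ENV} is not set")
    return {"Authorization": f"Bearer {key}"}
```

The returned dictionary can be passed as the `headers` argument of whichever HTTP client you use on the server side.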
Two endpoints are available: transcription (uses credits) and ASS generation (free).
Upload an audio or video file and receive a transcription with word-level timestamps. Supports both transcription (auto-detect speech) and alignment (match your script to audio). Credits are deducted based on audio duration. Failed jobs are automatically refunded.
| Parameter | Type | Required | Description |
|---|---|---|---|
| audio | File | Required | Audio or video file (MP3, WAV, M4A, MP4, etc.). Max 50MB. |
| language | string | Optional | Language code, e.g. "th", "en", "ja", "ko", "zh". Default: "th" |
| mode | string | Optional | "transcribe" (default) or "align" — align requires scriptText |
| scriptText | string | Optional | Your own transcript text. Required when mode is "align" |
| timestampMode | string | Optional | "chunk" (default) or "word" — word-level timestamps |
| assMaxChars | number | Optional | Max characters per subtitle line. Default: 24 |
| assMode | string | Optional | "smart" (default), "pause", or "word" — subtitle splitting mode |
| assOrientation | string | Optional | "portrait" (default) or "landscape" |
curl -X POST https://fastcaption.app/api/v1/transcribe \
  -H "Authorization: Bearer fc-YOUR_API_KEY" \
  -F "audio=@./my-audio.mp3" \
  -F "language=th" \
  -F "timestampMode=word"
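The same request can be sketched in Python. The example below assumes the third-party `requests` package for the upload. The helper names (`build_transcribe_fields`, `transcribe`), the 300-second timeout, and the reading of "50MB" as 50 × 2^20 bytes are illustrative assumptions. The field names and allowed values come from the parameter table above.

```python
import os
from typing import Any, Dict, Optional, Tuple

API_URL = "https://fastcaption.app/api/v1/transcribe"
# "Max 50MB" from the parameter table; treating MB as 2**20 bytes is an assumption.
MAX_UPLOAD_BYTES = 50 * 1024 * 1024

# Allowed values, copied from the parameter table.
_ALLOWED = {
    "mode": {"transcribe", "align"},
    "timestampMode": {"chunk", "word"},
    "assMode": {"smart", "pause", "word"},
    "assOrientation": {"portrait", "landscape"},
}


def build_transcribe_fields(language: str = "th", mode: str = "transcribe",
                            script_text: Optional[str] = None,
                            timestamp_mode: str = "chunk", ass_max_chars: int = 24,
                            ass_mode: str = "smart",
                            ass_orientation: str = "portrait") -> Dict[str, str]:
    """Validate options locally and return the multipart form fields."""
    fields = {
        "language": language,
        "mode": mode,
        "timestampMode": timestamp_mode,
        "assMaxChars": str(ass_max_chars),
        "assMode": ass_mode,
        "assOrientation": ass_orientation,
    }
    for name, allowed in _ALLOWED.items():
        if fields[name] not in allowed:
            raise ValueError(f"{name} must be one of {sorted(allowed)}, got {fields[name]!r}")
    if mode == "align" and not script_text:
        raise ValueError('scriptText is required when mode is "align"')
    if script_text:
        fields["scriptText"] = script_text
    return fields


def transcribe(audio_path: str, api_key: str, **options: Any) -> Tuple[int, Dict[str, Any]]:
    """Upload a file and return (HTTP status, parsed JSON body)."""
    import requests  # third-party; imported lazily so the helper above works without it

    fields = build_transcribe_fields(**options)
    if os.path.getsize(audio_path) > MAX_UPLOAD_BYTES:
        raise ValueError("File exceeds the 50MB upload limit")
    with open(audio_path, "rb") as fh:
        resp = requests.post(
            API_URL,
            headers={"Authorization": f"Bearer {api_key}"},
            files={"audio": fh},   # multipart file part named "audio"
            data=fields,           # remaining form fields
            timeout=300,           # arbitrary; long audio may need more
        )
    return resp.status_code, resp.json()
```

Validating locally before uploading avoids spending time on a large upload that would come back as a 400. For example, `transcribe("./my-audio.mp3", key, language="th", timestamp_mode="word")` mirrors the curl command.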
{
"success": true,
"jobId": "cm3xyz...",
"creditsUsed": 200,
"balanceAfter": 4800,
"result": {
"text": "สวัสดีครับ ยินดีต้อนรับ...",
"segments": [
{
"start": 0.0,
"end": 2.45,
"text": "สวัสดีครับ",
"words": [
{ "word": "สวัสดี", "start": 0.0, "end": 1.2 },
{ "word": "ครับ", "start": 1.3, "end": 2.45 }
]
}
],
"language": "th"
}
}
Error response (402, insufficient credits):
{
"error": "Insufficient credits",
"creditsNeeded": 500,
"balance": 200
}
Convert a transcription JSON (from the transcribe endpoint) into a TikTok-style ASS subtitle file. This endpoint is free: no credits are consumed. Processing is CPU-only. It accepts either a JSON file upload (multipart) or a JSON body.
| Parameter | Type | Required | Description |
|---|---|---|---|
| jsonFile | File | Required* | JSON file from transcription result (multipart mode) |
| json | object | Required* | JSON body with segments (JSON mode) |
| assMode | string | Optional | "smart" (default), "pause", or "word" |
| orientation | string | Optional | "portrait" (default) or "landscape" |
| maxChars | number | Optional | Max characters per line. Default: 24 |
| language | string | Optional | Language code. Default: "th" |
* Either jsonFile (multipart upload) or json (JSON body) is required — not both.
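The "one or the other" rule and the allowed option values can be checked before sending anything. This is a minimal sketch with hypothetical helper names (`pick_ass_input`, `build_ass_options`). It only validates input and does not assume how the JSON-mode body is laid out on the wire.

```python
from typing import Any, Dict, Optional

# Allowed values, copied from the parameter table.
ASS_MODES = {"smart", "pause", "word"}
ORIENTATIONS = {"portrait", "landscape"}


def pick_ass_input(json_file: Optional[str] = None,
                   json_body: Optional[Dict[str, Any]] = None) -> str:
    """Return "multipart" or "json", enforcing that exactly one input is given."""
    if (json_file is None) == (json_body is None):
        raise ValueError("Provide exactly one of jsonFile or json, not both and not neither")
    return "multipart" if json_file is not None else "json"


def build_ass_options(ass_mode: str = "smart", orientation: str = "portrait",
                      max_chars: int = 24, language: str = "th") -> Dict[str, str]:
    """Validate the optional parameters and return them as form-style strings."""
    if ass_mode not in ASS_MODES:
        raise ValueError(f"assMode must be one of {sorted(ASS_MODES)}")
    if orientation not in ORIENTATIONS:
        raise ValueError(f"orientation must be one of {sorted(ORIENTATIONS)}")
    if not isinstance(max_chars, int) or max_chars <= 0:
        raise ValueError("maxChars must be a positive integer")
    return {
        "assMode": ass_mode,
        "orientation": orientation,
        "maxChars": str(max_chars),
        "language": language,
    }
```

Note that this endpoint uses `orientation` and `maxChars`, while the transcribe endpoint uses `assOrientation` and `assMaxChars`.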
curl -X POST https://fastcaption.app/api/v1/ass \
  -H "Authorization: Bearer fc-YOUR_API_KEY" \
  -F "jsonFile=@./result.json" \
  -F "assMode=smart" \
  -F "orientation=portrait" \
  -F "maxChars=24"
{
"success": true,
"ass": "[Script Info]\nTitle: FastCaption...",
"captionCount": 42
}
Standard HTTP status codes are used. All error responses include a JSON body with an error field.
| Status | Meaning | Common Cause |
|---|---|---|
| 200 | Success — transcription result returned | Successful request |
| 400 | Bad Request — missing or invalid parameters | Missing audio file, invalid JSON, etc. |
| 401 | Unauthorized — invalid or missing API key | No Bearer token or invalid key |
| 402 | Payment Required — insufficient credits | Credit balance is lower than the credits needed; top up to continue |
| 500 | Internal Server Error — transcription failed | Server error, credits refunded |
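A client can turn these status codes into exceptions and pull word timings out of a successful result. The sketch below assumes the response shapes shown in the examples on this page. The class and function names are illustrative and not part of any official SDK.

```python
from typing import Any, Dict, List, Tuple


class FastCaptionError(Exception):
    """Raised for any non-success response; keeps the status and JSON body."""

    def __init__(self, status: int, body: Dict[str, Any]):
        self.status = status
        self.body = body
        super().__init__(f"HTTP {status}: {body.get('error', 'unknown error')}")


class InsufficientCredits(FastCaptionError):
    """Raised on 402; the body carries creditsNeeded and balance."""

    @property
    def shortfall(self) -> int:
        return self.body.get("creditsNeeded", 0) - self.body.get("balance", 0)


def interpret_response(status: int, body: Dict[str, Any]) -> Dict[str, Any]:
    """Return the body on success, otherwise raise a typed error."""
    if status == 200 and body.get("success"):
        return body
    if status == 402:
        raise InsufficientCredits(status, body)
    raise FastCaptionError(status, body)


def flatten_words(result: Dict[str, Any]) -> List[Tuple[str, float, float]]:
    """Collect (word, start, end) tuples from every segment that has words."""
    return [
        (w["word"], w["start"], w["end"])
        for seg in result.get("segments", [])
        for w in seg.get("words", [])
    ]
```

Catching `InsufficientCredits` separately lets a bot tell the user how many credits are missing instead of showing a generic failure.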
The API uses the same credit system as the web app. Credits are deducted based on audio duration.
Credits are calculated per second of audio. Based on the table below, the rate works out to 200 credits per minute (about 3.33 credits per second).
This means 1,000 credits ≈ 5 minutes of audio. New accounts get 5,000 free credits (~25 minutes).
| Audio Duration | Credits Used | Cost (THB) |
|---|---|---|
| 1 minute | 200 | ~฿4 |
| 5 minutes | 1,000 | ~฿20 |
| 30 minutes | 6,000 | ~฿120 |
| 1 hour | 12,000 | ~฿240 |
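The table rows can be reproduced with a small estimator, which is handy for checking a balance before uploading. This is a sketch only. The 200 credits per minute and ฿4 per 200 credits figures are read off the table, and rounding partial credits up is an assumption, since the exact rounding rule is not stated.

```python
import math

CREDITS_PER_MINUTE = 200   # from the table: 1 minute = 200 credits
THB_PER_200_CREDITS = 4    # from the table: 200 credits cost about ฿4


def estimate_credits(duration_seconds: float) -> int:
    """Estimate credits for a clip; rounding up is an assumption."""
    if duration_seconds < 0:
        raise ValueError("duration must be non-negative")
    return math.ceil(duration_seconds * CREDITS_PER_MINUTE / 60)


def estimate_cost_thb(credits: int) -> float:
    """Approximate cost in Thai baht for a number of credits."""
    return credits * THB_PER_200_CREDITS / 200


def can_afford(balance: int, duration_seconds: float) -> bool:
    """True if the balance covers the estimated credits for the clip."""
    return balance >= estimate_credits(duration_seconds)
```

With these constants, `estimate_credits(300)` gives 1000 and `estimate_cost_thb(1000)` gives 20.0, matching the 5-minute row. Treat the result as an estimate and rely on `creditsUsed` in the API response for the actual charge.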
The /api/v1/ass endpoint is completely free — no credits needed.
Failed transcriptions are automatically refunded.
The API is rate-limited by your credit balance. There are no per-minute request limits, but the following constraints apply: