Speech Service

Text to Speech

Get Started

Quick Tutorial

ajala's Text-to-Speech service converts digital text into natural-sounding, machine-generated speech in a variety of African languages. This tutorial provides a quick introduction to sending requests to the service and receiving its responses.

Step 1: Create an account

Go to https://api.ajala.ai and create a free account. Your account will automatically be provisioned credentials for accessing ajala’s speech platform. In order to request rate limits for development testing or production use, please send an email to support@ajalaco.com.

Step 2: Get your security credentials

Copy your API key from your account dashboard.

Step 3: Synthesize speech in Hausa

Issue the following request to generate a wav file that speaks the Hausa text passed in the text parameter. Before you run it:
  • Ensure curl is installed on your machine, and
  • Replace “<API_KEY>” with the API key from your account

curl -X POST \
    -H 'voice-id: hau_NG_female_1' \
    -H 'api-key: <API_KEY>' \
    -H 'content-id: text/plain' \
    -H 'audio-format: audio/wav' \
    -H 'ascii-encoded: false' \
    'http://api.ajala.ai/tts/dynamic/client?text=mùsáaƙîn%20ƙòoƙòo%20zàamàaníi%20yéèsù'

The response will contain the synthesized audio for the submitted text.
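The same request can be prepared from Python. The sketch below only builds the request so it can be inspected before anything is sent; the endpoint and header names are copied from the curl example, while the function name build_tts_request is a hypothetical helper and not part of ajala's API.

```python
import urllib.parse
import urllib.request

# Endpoint and header names are copied from the curl example above.
TTS_ENDPOINT = "http://api.ajala.ai/tts/dynamic/client"


def build_tts_request(text, api_key, voice_id="hau_NG_female_1"):
    """Build (but do not send) the POST request used in the tutorial."""
    # Percent-encode the text so spaces and diacritics are safe inside a URL.
    query = urllib.parse.urlencode({"text": text}, quote_via=urllib.parse.quote)
    headers = {
        "voice-id": voice_id,
        "api-key": api_key,
        "content-id": "text/plain",
        "audio-format": "audio/wav",
        "ascii-encoded": "false",
    }
    return urllib.request.Request(
        TTS_ENDPOINT + "?" + query, headers=headers, method="POST"
    )


request = build_tts_request("mùsáaƙîn ƙòoƙòo", api_key="<API_KEY>")
print(request.full_url)
# To send it: body = urllib.request.urlopen(request).read()
```

Printing the URL first makes it easy to confirm that spaces became %20 and that non-ASCII characters were encoded as UTF-8 bytes.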

About

ajala’s Text-to-Speech service provides RESTful APIs that deliver access to ajala’s African language Text-to-Speech solutions. ajala focuses on delivering high-quality, natural-sounding voices that accurately capture dialectal, tonal, and regional variations in supported African languages. Models are trained in a low-resource context, and our first release includes a variety of concatenative voices. For updates on subsequent releases, please sign up for our newsletter.

The service can be integrated across various channels, including IVR, chatbots, mobile apps, and other bespoke applications. The service accepts plain text and Speech Synthesis Markup Language (SSML). ajala offers premium accounts support for integrating our platform with bespoke solutions. Please email support@ajalaco.com for additional information.

Language Support

Currently, the Text-to-Speech service offers male and/or female concatenative voices in the following languages:

  • Hausa
  • Igbo
  • Kiswahili
  • Kinyarwanda
  • Yoruba

For more information, see Supported voices.

Audio Formats

Currently, the Text-to-Speech service supports streaming audio in response to synthesis requests in the following MIME type formats:

  • mulaw
  • wav
  • mp3

For more information, see Audio formats

Pricing

For inquiries concerning premium account pricing and bespoke/custom voices, please contact sales@ajalaco.com.

Introduction

Last updated: 2021-03-24

ajala’s Text-to-Speech (TTS) service provides the ability to synthesize speech in various African languages using ajala’s speech synthesis capabilities. The service is accessible via ajala’s REST APIs, and provides male and/or female voices in supported languages. The speech synthesis service provides an HTTP interface that accepts SSML and plain text requests.

Authentication

The current version of the service authenticates access using API keys passed in the request header, as apikey. Your auth keys and service access should be managed through your user account.

Requests

Synthesis requests can be issued from the command line or from code. The examples below use curl, Python, and Node.js.

curl -X POST \
    -T test/hau_test.txt \
    -H 'voice-id: hau_NG_female_1' \
    -H 'auth-key: <AUTH_KEY>' \
    -H 'device-id: <SSID>' \
    -H 'content-id: text/plain' \
    -H 'audio-format: audio/wav' \
    -H 'ascii-encoded: true' \
    http://api.ajala.ai/tts/dynamic/client
Tip: AUTH_KEY and SSID should be replaced by your account credentials.
import base64
import json
import os
import subprocess
import urllib.parse

ttsURL = "http://api.ajala.ai/tts?apikey=<API_KEY>&text="
ttsVoice = "&voice=Yoruba+Nigerian+male&fmt=wav"
textFileDir = '/path/to/location/of/text/file/to/synthesize'
textFileName = '<EXAMPLE_TEXT.txt>'
jsonFileName = '<JSON_FILE_NAME>'
outputJSONPath = 'path/to/save/json/file/generated/as/a/result/of/POST/request'
outputAudioFileName = '<AUDIO_FILE_NAME>'
outputAudioPath = 'path/to/save/wav/file/extracted/from/JSON/response/to/POST/request'

# Get contents of textFileName (text to be synthesized, saved in a txt file)
with open(os.path.join(textFileDir, textFileName), encoding='utf-8') as f:
    textToSynthesize = f.read().strip()

# URL-encode the text so spaces and diacritics survive inside the query string
requestURL = ttsURL + urllib.parse.quote(textToSynthesize + '.') + ttsVoice
jsonPath = os.path.join(outputJSONPath, jsonFileName + '.json')

# Run curl and save the JSON response to jsonPath
subprocess.run(['curl', '-s', requestURL, '-o', jsonPath], check=True)

data = {}
try:
    with open(jsonPath, 'r', encoding='utf-8') as f:
        data = json.load(f)
    # Decode the base64-encoded audio and write it out as a wav file
    audioPath = os.path.join(outputAudioPath, outputAudioFileName + '.wav')
    with open(audioPath, 'wb') as ff:
        ff.write(base64.b64decode(data['data']['sound_base64']))
except (OSError, ValueError, KeyError):
    # Unreadable file, invalid JSON, or missing audio field
    data['success'] = 'False'
Tip: API_KEY should be replaced by your account’s API key.
function callAPI(apiURL){
    var XMLHttpRequest = require("xmlhttprequest").XMLHttpRequest;
    var Httpreq = new XMLHttpRequest();
    Httpreq.open("GET",apiURL,false); //third argument false = synchronous request
    Httpreq.send(null);
    return Httpreq.responseText;
}
var apiKey = "API_KEY";
var apiURL = "http://api.ajala.ai/tts";

//Initialize query parameters
var ttsVoice = "Kiswahili_male_Tanzania_demo";
var audioFormat = "wav";
var sampleText = "Jina lako nani";

//Construct GET request (spaces in the text and voice are replaced with "+")
var ttsParams = "&text="+sampleText.replace(/\s/g,"+")+"&voice="+ttsVoice.replace(/\s/g,"+")+"&fmt="+audioFormat;
apiURL = apiURL + "?apikey=" + apiKey + ttsParams;
console.log(apiURL); //validate URL before executing request

//Execute GET request
var jsonResponse = JSON.parse(callAPI(apiURL));
console.log(jsonResponse.success);
Tip: API_KEY should be replaced by your account’s API key.

Header

Parameters are passed to the Text-to-Speech service in the headers. The service expects the following parameters in the request header:

Field         Description
auth-key      Your account’s API key
device-id     Your account’s SSID
content-id    Content type, i.e. text/plain
voice-id      ID of the voice that should be used for the synthesis request
audio-format  MIME type of the binary-encoded synthesized audio in the response, e.g. audio/wav
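A client can check its headers against this list before sending a request. The sketch below is illustrative only: it assumes every field in the table is required, and missing_headers is a hypothetical helper name rather than anything the service provides.

```python
# Header fields listed in the table above (assumed here to all be required).
REQUIRED_TTS_HEADERS = (
    "auth-key",
    "device-id",
    "content-id",
    "voice-id",
    "audio-format",
)


def missing_headers(headers):
    """Return the expected header fields that are absent or empty."""
    # HTTP header names are case-insensitive, so compare in lower case.
    present = {name.lower() for name, value in headers.items() if value}
    return [name for name in REQUIRED_TTS_HEADERS if name not in present]


draft = {"Auth-Key": "<AUTH_KEY>", "voice-id": "hau_NG_female_1", "device-id": ""}
print(missing_headers(draft))
```

An empty value is treated the same as a missing field, which catches placeholders that were never filled in.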

Supported voices

We aim to provide male and female voices in all supported languages. The table below summarizes all voices ajala currently supports.

Language     voice-id            Gender  Description
Yoruba       yor-NG_SadeIjesaV1  Female
Kiswahili    ksw-TZ_JohnDemoV1   Male
Kinyarwanda  kin-RW_FemaleV1     Female
Kinyarwanda  kin-RW_MaleV1       Male
Igbo         igb-NG_NgoziV1      Female
Hausa        hau-NG_HafsatV1     Female
Tip: Send a request to support@ajalaco.com for additional languages you would like us to support

Audio formats

The service currently supports returning audio in any of the following formats:

  • audio/mulaw (8 kHz)
  • audio/wav (16 kHz)
  • audio/mp3 (16 kHz)
Tip: Send a request to support@ajalaco.com for additional MIME types you would like us to support

Data Collection

The Text-to-Speech service automatically logs all requests and responses in an anonymized manner, as part of overall system monitoring. Data from requests may be used to implement improvements to the service, and such data is not made public. The service stores data in a manner that supports the European Union’s General Data Protection Regulation, and complies with ajala’s security and data privacy protocols.

Tip: To discuss disabling logging for your account, please contact support@ajalaco.com

Error Handling

The Text-to-Speech service uses standard HTTP response codes to indicate the status of the request. Unsuccessful requests are accompanied by an error_code and error_message that provide additional context around the failure.

Field Description
error_code Integer server assigns to specific error scenarios. This is separate to the HTTP response code.
error_message A description of the related error; otherwise, null.
Tip: HTTP response codes 2xx indicate success; 4xx responses indicate failure; 5xx indicate an internal system error.
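The rule in the tip can be turned into a small client-side check. The following is a sketch rather than an official client: classify_status and describe_failure are hypothetical names, and the lookup under "synth" is an assumption based on the sample responses in this document, which nest error_code and error_message there.

```python
def classify_status(http_status):
    """Map an HTTP status code to the outcome classes described in the tip."""
    if 200 <= http_status < 300:
        return "success"
    if 400 <= http_status < 500:
        return "failure"
    if 500 <= http_status < 600:
        return "internal error"
    return "other"


def describe_failure(http_status, body):
    """Build a one-line log message from the status and the error fields."""
    # Assumption: error fields sit under "synth" as in the sample responses;
    # fall back to the top level of the body if that key is absent.
    source = body.get("synth", body)
    code = source.get("error_code")
    message = source.get("error_message") or "no error_message provided"
    outcome = classify_status(http_status)
    return f"HTTP {http_status} ({outcome}), error_code={code}: {message}"


body = {"synth": {"error_code": 1, "error_message": "No yor_NG_female_1 workers available"}}
print(describe_failure(503, body))
```

Keeping the HTTP status and the service's own error_code side by side in one message reflects the note that the two values are separate.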

Methods

The Text-to-Speech service supports a number of methods for initiating and managing speech synthesis requests.

Method     Endpoint                        Description
Voices     /tts/service/voices             Returns a JSON array of all supported voices
Voices     /tts/service/voices/{language}  Returns a JSON array of supported voices for language
Synthesis  /tts/dynamic/client             Returns a JSON object including synthesized speech and related metadata

Voices

The Text-to-Speech service includes service endpoints for retrieving information related to currently available voices.

List Voices: /tts/service/voices (GET)

The service provides an HTTP GET endpoint that returns a JSON array of all voices ajala’s Text-to-Speech service currently supports

Sample Request

curl -X GET -H 'apiKey: <API_KEY>' \
        https://api.ajala.ai/tts/service/voices

Sample Response

{
    "voices": {
        "ksw_KE": [
            {
                "name": "ksw_TZ-JohnDemo",
                "gender": "male",
                "description": "John: Tanzanian Swahili male voice (demo only)"
            }
        ],
        "yor_NG": [
            {
                "name": "yor_NG-SadeIjesaV1",
                "gender": "female",
                "description": "Sade: Nigerian Ijesa Yoruba female voice"
            }
        ],
        "hau_NG": [
            {
                "name": "hau_NG-HafsatV1",
                "gender": "female",
                "description": "Hafsat: Nigerian Hausa female voice"
            }
        ],
        "igb_NG": [
            {
                "name": "igb_NG-NgoziV1",
                "gender": "female",
                "description": "Ngozi: Nigerian Igbo female voice"
            }
        ],
        "kin_RW": [
            {
                "name": "kin_RW-ClaudeV1",
                "gender": "male",
                "description": "Claude: Rwandan Kinyarwanda male voice"
            },
            {
                "name": "kin_RW-JosephineV1",
                "gender": "female",
                "description": "Josephine: Rwandan Kinyarwanda female voice"
            }
        ]
    }
}
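A response shaped like the sample can be searched for voices with a few lines of Python. This is a minimal sketch that assumes the "voices" object maps language codes to lists of voice entries, as in the sample; voices_by_gender is a hypothetical helper, and the payload is abbreviated from the sample response.

```python
import json

# Abbreviated copy of the sample response.
sample = json.loads("""
{
    "voices": {
        "hau_NG": [
            {"name": "hau_NG-HafsatV1", "gender": "female",
             "description": "Hafsat: Nigerian Hausa female voice"}
        ],
        "kin_RW": [
            {"name": "kin_RW-ClaudeV1", "gender": "male",
             "description": "Claude: Rwandan Kinyarwanda male voice"},
            {"name": "kin_RW-JosephineV1", "gender": "female",
             "description": "Josephine: Rwandan Kinyarwanda female voice"}
        ]
    }
}
""")


def voices_by_gender(payload, gender):
    """Return the sorted names of all voices with the given gender."""
    return sorted(
        voice["name"]
        for voices in payload["voices"].values()
        for voice in voices
        if voice["gender"] == gender
    )


print(voices_by_gender(sample, "female"))
```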

List Voice: /tts/service/voices/{language} (GET)

The service provides an HTTP GET endpoint that returns a JSON array of all voices ajala’s Text-to-Speech service currently supports for a given language.

{
    "voices": {
        "ksw_KE": [
            {
                "name": "ksw_TZ-JohnDemo",
                "gender": "male",
                "description": "John: Tanzanian Swahili male voice (demo only)"
            }
        ]
    }
}

Response Fields

field Description
name A unique identifier for each voice
gender The gender of the voice
description A summary of the voice, including country, regional or other features unique to the voice

Status Code

Status Code Description
200 Request successful
400 Bad Request
403 Unsupported language

Synthesize audio: /tts/dynamic/client (POST)

The Text-to-Speech synthesis service accepts HTTP POST synthesis requests where the text is posted in the body of the request, or as a request parameter. The maximum text size acceptable for a synthesis request is xxx.

A utf-8 encoded plain text file can be posted to the service for synthesis:

curl -X POST \
   -T test/hau_test_utf.txt \
   -H 'voice-id':'hau_NG_female_1' \
   -H 'device-id':'<SSID>' \
   -H 'content-id':'text/plain' \
   -H 'audio-format':'audio/wav' \
   -H 'ascii-encoded':'false' \
   http://api.ajala.ai/tts/dynamic/client

Alternatively, the text may be submitted as a request parameter:

curl -X POST \
    -H 'voice-id: hau_NG_female_1' \
    -H 'device-id: <SSID>' \
    -H 'content-id: text/plain' \
    -H 'audio-format: audio/wav' \
    -H 'ascii-encoded: false' \
    'http://api.ajala.ai/tts/dynamic/client?text=mùsáaƙîn%20ƙòoƙòo%20zàamàaníi%20yéèsù'

Response Fields

Field                Description
voice_id             The voice name
account_sid          User-generated ID submitted with the request
api_version          The version of the API invoked in the request
date_created         Date the synthesized audio was created
synth.attributes     Attributes of the synthesized audio file, i.e. channel_count, sample_count, sample_rate, sample_coding, sample_n_bytes, sample_byte_format
synth.status         Synthesis process status (0 – Successful)
synth.audio-format   Audio format submitted with the request
synth.id             Unique ID associated with the request
synth.encoded_audio  Binary encoded synthesized audio
error_code           Integer the server assigns to specific error scenarios. This is separate from the HTTP response code.
error_message        A description of the related error; otherwise, null.
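Once a response has been parsed, the audio can be pulled out of synth.encoded_audio. The sketch below assumes the field is base64-encoded; this document does not state the encoding, although the sample value resembles base64. The helper name extract_audio is hypothetical.

```python
import base64


def extract_audio(response):
    """Return the decoded audio bytes from a parsed synthesis response."""
    synth = response.get("synth", {})
    encoded = synth.get("encoded_audio")
    if encoded is None:
        # Failed responses carry an error_message instead of audio.
        raise ValueError(synth.get("error_message") or "response has no encoded_audio")
    # Assumption: encoded_audio is base64 text.
    return base64.b64decode(encoded)


# Tiny synthetic response for illustration; real audio is much longer.
demo = {"synth": {"status": 0, "encoded_audio": base64.b64encode(b"RIFF").decode("ascii")}}
audio = extract_audio(demo)
print(len(audio), "bytes decoded")
# To save it: open("output.wav", "wb").write(audio)
```

Raising on a missing field keeps a failed synthesis from silently producing an empty audio file.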

Status Code

Status Code Description
200 Request successful
400 Bad Request
415 Unsupported media type
500 Internal Server Error
503 Service unavailable

Sample Responses

Successful Response

{
    "voice_id": "hau_NG_female_1",
    "account_sid": "test_ssid_1_postman",
    "api_version": "v0.0.1",
    "date_created": "2021-02-03 10:49:01",
    "synth": {
        "sph-audio": "TklTVF8xQQogICAxMDI0CmNoYW5uZWxfY29....",
        "attributes": {
            "channel_count": "1",
            "sample_count": "47356",
            "sample_rate": "16000",
            "sample_coding": "pcm",
            "sample_n_bytes": "2",
            "sample_byte_format": "01"
        },
        "status": 0,
        "audio-format": "audio/wav",
        "id": "d2bf293b-1d7d-4962-b71d-cfaab6a04ed3",
        "encoded_audio": "/35+fv/////+fv//fv//////////fv9+fn7//////v////7//////37///////////7//v/+////....",
        "error_code": 0,
        "error_message": null
    }
}
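The attributes object is enough to work out how long the synthesized clip is. The sketch below assumes sample_count counts samples per channel and that the attribute values arrive as strings, as they do in the sample; audio_stats is a hypothetical helper name.

```python
def audio_stats(attributes):
    """Return (duration in seconds, raw PCM size in bytes) for a clip."""
    # The sample response encodes every attribute as a string.
    sample_count = int(attributes["sample_count"])
    sample_rate = int(attributes["sample_rate"])
    bytes_per_sample = int(attributes["sample_n_bytes"])
    channels = int(attributes["channel_count"])
    duration = sample_count / sample_rate
    # Assumption: sample_count is per channel, so scale by channel_count.
    size = sample_count * bytes_per_sample * channels
    return duration, size


attributes = {
    "channel_count": "1",
    "sample_count": "47356",
    "sample_rate": "16000",
    "sample_coding": "pcm",
    "sample_n_bytes": "2",
    "sample_byte_format": "01",
}
duration, size = audio_stats(attributes)
print(f"{duration:.2f} s, {size} bytes")
```

For the sample response this works out to 47356 / 16000, or just under three seconds of audio, and 47356 × 2 = 94712 bytes of PCM data.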

Failed response: No available workers

{
    "voice-id": "yor_NG_female_1",
    "account_sid": "test_ssid_1_postman",
    "api_version": "v0.0.1",
    "date_created": "2021-02-03 14:01:05",
    "synth": {
        "status": 0,
        "id": "ee990ab9-468c-456f-b74a-111324aa7765",
        "error_message": "No yor_NG_female_1 workers available",
        "error_code": 1
    }
}

Failed Response: Synthesis Failure

{
        "voice_id": "hau_NG_female_1",
        "account_sid": "test_ssid_3_postman",
        "api_version": "v0.0.1",
        "date_created": "2021-02-03 15:22:18",
        "synth": {
            "status": 0,
            "id": "aa6c0dc5-b18c-4ee7-b3dd-aaf814546309",
            "error_message": "Synthesis request failed",
            "error_code": 0
        }
}

Speech to Text

Get Started

Quick Tutorial

ajala's Speech-to-Text service transcribes human speech into digital text in a variety of African languages. This tutorial provides a quick introduction to sending requests to the service and receiving its responses.

Step 1: Create an account

Go to https://api.ajala.ai and create a free account. Your account will automatically be provisioned credentials for accessing ajala’s speech platform. In order to request rate limits for development testing or production use, please send an email to support@ajalaco.com.

Step 2: Get your security credentials

Copy your API Key from your account dashboard

Step 3: Download a sample audio recording of Yoruba speech <link to file>

Step 4: Transcribe the sample Yoruba recording

Issue the following request to generate a JSON response that includes a transcription of the sample audio file. Before you run it:
  • Ensure curl is installed on your machine, and
  • Replace “<API_KEY>” with the API key from your account

curl -X POST \
    -T yoruba-test.wav \
    -H 'apikey: <API_KEY>' \
    -H 'device-id: test_device' \
    -H 'recognizer-id: yorHighRes' \
    http://api.ajala.ai/asr/dynamic/client

A successful request will return a JSON response that resembles:

{
      "date_created": "2021-03-27 17:26:52",
      "recog": {
        "status": 0,
        "hypotheses": [
          {
            "confidence": 10000000000,
            "utterance": "ẹnyin ọrẹ́ẹ mi ẹ bí òwe bí òwe ni à ńlu ìlù ògìdìgbó."
          }
        ],
        "id": "2c87cb2e-c694-49c8-b9e5-77d92072d4bb"
      },
      "lang-id": "yorHighRes",
      "api_version": "v0.0.1",
      "account_sid": "test_device"
    }
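The transcription itself sits in recog.hypotheses. A minimal sketch of reading it follows, assuming the list may hold more than one hypothesis and that a larger confidence value means a better match; best_utterance is a hypothetical helper name.

```python
def best_utterance(response):
    """Return the utterance with the highest confidence, or None."""
    hypotheses = response.get("recog", {}).get("hypotheses", [])
    if not hypotheses:
        return None
    # Assumption: higher confidence values indicate better hypotheses.
    best = max(hypotheses, key=lambda item: item.get("confidence", 0))
    return best["utterance"]


# Parsed form of the sample response above.
response = {
    "date_created": "2021-03-27 17:26:52",
    "recog": {
        "status": 0,
        "hypotheses": [
            {
                "confidence": 10000000000,
                "utterance": "ẹnyin ọrẹ́ẹ mi ẹ bí òwe bí òwe ni à ńlu ìlù ògìdìgbó.",
            }
        ],
        "id": "2c87cb2e-c694-49c8-b9e5-77d92072d4bb",
    },
    "lang-id": "yorHighRes",
    "api_version": "v0.0.1",
    "account_sid": "test_device",
}
print(best_utterance(response))
```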

About

ajala’s Speech-to-Text service provides RESTful APIs that deliver access to ajala’s African language Speech-to-Text solutions. ajala focuses on delivering high-quality transcriptions that remain accurate across a variety of tonal and regional variations in supported African languages. Models are available that support limited-vocabulary conversational speech and context-specific entities, e.g. names, numbers, and places. Additionally, models are available for high-resolution (>= 16 kHz) and low-resolution (8 kHz) audio, and transcription supports a variety of audio encodings. For updates on subsequent releases, please sign up for our newsletter.

The service can be integrated across various channels, including IVR, chatbots, mobile apps, and other bespoke applications. ajala offers premium accounts support for integrating our platform with bespoke solutions, as well as the possibility of customizing our acoustic and language models for bespoke use cases. Please email support@ajalaco.com for additional information.

Language Support

Currently, the Speech-to-Text service supports voice recognition in the following languages:

  • Hausa
  • Igbo
  • Kiswahili
  • Kinyarwanda
  • Yoruba

For more information, see Supported voices.

Audio Formats

Currently, the Speech-to-Text service accepts audio for transcription requests in the following formats:

  • raw/pcm
  • wav
  • mp3
  • ogg

For more information, see Audio formats.

Pricing

For inquiries concerning premium account pricing and bespoke/custom models, please contact sales@ajalaco.com.

Introduction

Last updated: 2021-03-29

ajala’s Speech-to-Text (STT) service provides the ability to transcribe speech in various African languages using ajala’s speech recognition capabilities. The service is accessible via ajala’s REST APIs, and supports a variety of audio encodings and modalities.

Authentication

The current version of the service authenticates access using API keys passed in the request header, as apikey. Your auth keys and service access should be managed through your user account.

Requests

Recognition requests can be issued from the command line or from code. The examples below use curl, Python, and Node.js.

curl -X POST \
    -T <audio-file> \
    -H 'apikey: <AUTH_KEY>' \
    -H 'device-id: test_device' \
    -H 'recognizer-id: yor8k_names' \
    http://api.ajala.ai/asr/dynamic/client
Tip: AUTH_KEY and SSID should be replaced by your account credentials.
import json
import os
import subprocess

asrURL = "http://api.ajala.ai/asr/dynamic/client?apikey=<API_KEY>&recognizer_id="
asrContext = "yor8k_names"
jsonFileName = '<JSON_FILE_NAME.json>'
jsonDir = "/path/to/location/to/save/output/json/file"
audioFileDir = '/path/to/location/of/audio/file/to/transcribe'
audioFileName = '<EXAMPLE_AUDIO.wav>'
transcriptionFileDir = '/path/to/location/of/transcription/of/audio/file'

audioPath = os.path.join(audioFileDir, audioFileName)
jsonPath = os.path.join(jsonDir, jsonFileName)

# Upload the audio file with curl and save the JSON response to jsonPath
subprocess.run(['curl', '-s', '-X', 'POST', '-T', audioPath,
                asrURL + asrContext, '-o', jsonPath], check=True)

with open(jsonPath, 'r', encoding='utf-8') as f:
    data = json.load(f)

# Take the first hypothesis (nested under "recog" in the sample responses)
# and strip full stops from the utterance
transcription = data['recog']['hypotheses'][0]['utterance'].replace(".", "")

# Save the transcription next to the name of the JSON file
transcriptionPath = os.path.join(transcriptionFileDir, jsonFileName + ".txt")
with open(transcriptionPath, 'w', encoding='utf-8') as fout:
    fout.write(transcription)

Examples are presented using node.js HTTP Library

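var fs = require("fs");
var http = require("http");

//POST the audio file to apiURL and pass the parsed JSON response to callback
function callASRAPI(apiURL, audioFileName, callback){
    var audio = fs.readFileSync(audioFileName);
    var options = { method: "POST", headers: { "Content-Length": audio.length } };
    var req = http.request(apiURL, options, function(res){
        var body = "";
        res.setEncoding("utf8");
        res.on("data", function(chunk){ body += chunk; });
        res.on("end", function(){ callback(null, JSON.parse(body)); });
    });
    req.on("error", callback);
    req.end(audio);
}
var apiKey = "API_KEY";
var apiURL = "http://api.ajala.ai/asr/dynamic/client";

//Initialize query parameters
var asrModel = "yorHighRes";
var audioFile = "<audio-file-name>";

//Construct POST request
var asrURL = apiURL + "?apikey=" + apiKey + "&recognizer_id=" + asrModel;
console.log(asrURL); //validate URL before executing request

//Execute POST request
callASRAPI(asrURL, audioFile, function(err, jsonResponse){
    if (err) { return console.error(err); }
    console.log(jsonResponse);
});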

Header

Parameters are passed to the Speech-to-Text service in the headers. The service expects the following parameters in the request header:

Field Description
apikey Your account’s API key
device-id Your account’s SSID
recognizer-id Id for the model that should be used for the recognition request

Supported voices

We aim to provide conversational and context-optimized models for counting, names, and common places in all supported languages. We also provide model varieties that support high-resolution (>= 16 kHz) and low-resolution (8 kHz) audio. The table below summarizes all models ajala currently supports.

Language     recognizer-id           Sample Frequency  Description
Yoruba       yor-NG_HighRes          >= 16 kHz         Conversational speech (limited vocabulary)
Yoruba       yor-NG_NamesHighRes     >= 16 kHz         Common names
Yoruba       yor-NG_CountingHighRes  >= 16 kHz         Number sequences
Yoruba       yor-NG_PlacesHighRes    >= 16 kHz         Common places
Yoruba       yor-NG_LowRes           8 kHz             Conversational speech (limited vocabulary)
Yoruba       yor-NG_NamesLowRes      8 kHz             Common names
Yoruba       yor-NG_CountingLowRes   8 kHz             Number sequences
Yoruba       yor-NG_PlacesLowRes     8 kHz             Common places
Kiswahili    ksw-KE_HighRes          >= 16 kHz         Conversational speech (limited vocabulary)
Kiswahili    ksw-KE_NamesHighRes     >= 16 kHz         Common names
Kiswahili    ksw-KE_CountingHighRes  >= 16 kHz         Number sequences
Kiswahili    ksw-KE_PlacesHighRes    >= 16 kHz         Common places
Kiswahili    ksw-KE_LowRes           8 kHz             Conversational speech (limited vocabulary)
Kiswahili    ksw-KE_NamesLowRes      8 kHz             Common names
Kiswahili    ksw-KE_CountingLowRes   8 kHz             Number sequences
Kiswahili    ksw-KE_PlacesLowRes     8 kHz             Common places
Kinyarwanda  kin-RW_HighRes          >= 16 kHz         Conversational speech (limited vocabulary)
Kinyarwanda  kin-RW_NamesHighRes     >= 16 kHz         Common names
Kinyarwanda  kin-RW_CountingHighRes  >= 16 kHz         Number sequences
Kinyarwanda  kin-RW_PlacesHighRes    >= 16 kHz         Common places
Kinyarwanda  kin-RW_LowRes           8 kHz             Conversational speech (limited vocabulary)
Kinyarwanda  kin-RW_NamesLowRes      8 kHz             Common names
Kinyarwanda  kin-RW_CountingLowRes   8 kHz             Number sequences
Kinyarwanda  kin-RW_PlacesLowRes     8 kHz             Common places
Igbo         igb-NG_HighRes          >= 16 kHz         Conversational speech (limited vocabulary)
Igbo         igb-NG_NamesHighRes     >= 16 kHz         Common names
Igbo         igb-NG_CountingHighRes  >= 16 kHz         Number sequences
Igbo         igb-NG_PlacesHighRes    >= 16 kHz         Common places
Igbo         igb-NG_LowRes           8 kHz             Conversational speech (limited vocabulary)
Igbo         igb-NG_NamesLowRes      8 kHz             Common names
Igbo         igb-NG_CountingLowRes   8 kHz             Number sequences
Igbo         igb-NG_PlacesLowRes     8 kHz             Common places
Hausa        hau-NG_HighRes          >= 16 kHz         Conversational speech (limited vocabulary)
Hausa        hau-NG_NamesHighRes     >= 16 kHz         Common names
Hausa        hau-NG_CountingHighRes  >= 16 kHz         Number sequences
Hausa        hau-NG_PlacesHighRes    >= 16 kHz         Common places
Hausa        hau-NG_LowRes           8 kHz             Conversational speech (limited vocabulary)
Hausa        hau-NG_NamesLowRes      8 kHz             Common names
Hausa        hau-NG_CountingLowRes   8 kHz             Number sequences
Hausa        hau-NG_PlacesLowRes     8 kHz             Common places
Tip: Send a request to support@ajalaco.com for additional languages you would like us to support
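The model names in the table follow a regular pattern of language prefix, context, and resolution, so a client can derive the identifier from the audio it is about to send. The sketch below assumes that pattern holds for every row of the table; recognizer_id and the two lookup dictionaries are hypothetical helpers, and other parts of this document use differently styled identifiers such as yorHighRes.

```python
# Prefixes and context names read off the table above.
LANGUAGE_PREFIXES = {
    "Yoruba": "yor-NG",
    "Kiswahili": "ksw-KE",
    "Kinyarwanda": "kin-RW",
    "Igbo": "igb-NG",
    "Hausa": "hau-NG",
}
CONTEXTS = {
    "conversational": "",
    "names": "Names",
    "counting": "Counting",
    "places": "Places",
}


def recognizer_id(language, context, sample_rate_hz):
    """Build a model identifier from language, context, and sample rate."""
    if sample_rate_hz >= 16000:
        resolution = "HighRes"
    elif sample_rate_hz == 8000:
        resolution = "LowRes"
    else:
        raise ValueError(f"no model listed for {sample_rate_hz} Hz audio")
    return f"{LANGUAGE_PREFIXES[language]}_{CONTEXTS[context]}{resolution}"


print(recognizer_id("Yoruba", "names", 8000))
print(recognizer_id("Kiswahili", "conversational", 44100))
```

Rejecting sample rates that the table does not cover, such as 11025 Hz, is a design choice in this sketch; resampling the audio first would be an alternative.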

Audio formats

The service currently accepts audio in any of the following formats:

  • audio/mulaw (8 kHz)
  • audio/wav (16 kHz)
  • audio/mp3 (16 kHz)
Tip: Send a request to support@ajalaco.com for additional MIME types you would like us to support

Data Collection

The Speech-to-Text service automatically logs all requests and responses in an anonymized manner, as part of overall system monitoring. Data from requests may be used to implement improvements to the service, and such data is not made public. The service stores data in a manner that supports the European Union’s General Data Protection Regulation, and complies with ajala’s security and data privacy protocols.

Tip: To discuss disabling logging for your account, please contact support@ajalaco.com

Error Handling

The Speech-to-Text service uses standard HTTP response codes to indicate the status of the request. Unsuccessful requests are accompanied by an error_code and error_message that provide additional context around the failure.

Field Description
error_code Integer server assigns to specific error scenarios. This is separate to the HTTP response code.
error_message A description of the related error; otherwise, null.
Tip: HTTP response codes 2xx indicate success; 4xx responses indicate failure; 5xx indicate an internal system error.

Methods

The Speech-to-Text service supports a number of methods for initiating and managing speech recognition requests.

Method       Endpoint                           Description
Languages    /asr/service/languages             Returns a JSON object of all supported languages
Languages    /asr/service/languages/{language}  Returns a JSON array of models for language
Recognition  /asr/dynamic/client                Returns a JSON object including transcribed speech and related metadata

Languages

The Speech-to-Text service includes service endpoints for retrieving information related to currently available recognition models.

List Languages: /asr/service/languages (GET)

The service provides an HTTP GET endpoint that returns all languages ajala’s Speech-to-Text service currently supports, as a JSON object mapping language codes to language names.

Sample Request

curl -X GET -H 'apiKey: <API_KEY>' \
        https://api.ajala.ai/asr/service/languages

Sample Response

{
"igb_NG": "Nigerian Igbo",
"yor_NG": "Nigerian Yoruba",
"kin_RW": "Rwandan Kinyarwanda",
"hau_NG": "Nigerian Hausa",
"ksw_KE": "Kenyan Kiswahili"
}

List Language: /asr/service/languages/{language} (GET)

The service provides an HTTP GET endpoint that returns a JSON array of all recognition models ajala’s Speech-to-Text service currently supports for a given language.

Sample Request

curl -X GET -H 'apiKey: <API_KEY>' \
        https://api.ajala.ai/asr/service/languages/ksw

Sample Response

{
    "ksw_KE": [
        {
            "name": "ksw-KE_HighRes",
            "description": "Conversational Kenyan Kiswahili (>=16Khz)"
        },
        {
            "name": "ksw-KE_NamesHighRes",
            "description": "Names in Kenyan Kiswahili (>=16Khz)"
        },
        {
            "name": "ksw-KE_CountingHighRes",
            "description": "Counting in Kenyan Kiswahili (>=16Khz)"
        },
        {
            "name": "ksw-KE_PlacesHighRes",
            "description": "Places in Kenyan Kiswahili (>=16Khz)"
        },
        {
            "name": "ksw-KE_LowRes",
            "description": "Conversational Kenyan Kiswahili (8Khz)"
        },
        {
            "name": "ksw-KE_NamesLowRes",
            "description": "Names in Kenyan Kiswahili (8Khz)"
        },
        {
            "name": "ksw-KE_CountingLowRes",
            "description": "Counting in Kenyan Kiswahili (8Khz)"
        },
        {
            "name": "ksw-KE_PlacesLowRes",
            "description": "Places in Kenyan Kiswahili (8Khz)"
        }
    ]
}

Response Fields

field Description
name A unique identifier for each recognition model
description A summary of the recognition model, including country, audio modality and/or other features unique to the recognition model

Status Code

Status Code Description
200 Request successful
400 Bad Request
500 Service unavailable

Transcribe audio: /asr/dynamic/client (POST)

The Speech-to-Text recognition service accepts HTTP POST recognition requests where the audio is posted in the body of the request. The maximum audio size acceptable for a recognition request is xxx.

An audio file can be posted to the service along with a recognizer-id indicating the model that should be used for transcription:

curl -X POST \
    -T <audio-file> \
    -H 'apikey: <AUTH_KEY>' \
    -H 'device-id: test_device' \
    -H 'recognizer-id: yor8k_names' \
    http://api.ajala.ai/asr/dynamic/client

Response Fields

Field Description
recognizer-id The name of the transcription model
account_sid User-generated ID submitted with the request
api_version The version of the API invoked in the request
date_created Date the transcription was created
recog.status Process status (0 – Successful)
recog.hypotheses Hypotheses of estimated transcription, comprising a confidence level (0-1) and utterance
recog.id Unique id associated with the request
recog.error_code Integer server assigns to specific error scenarios. This is separate to the HTTP response code.
recog.error_message A description of the related error; otherwise, null.

Status Code

Status Code Description
200 Request successful
400 Bad Request
500 Internal Server Error
503 Service unavailable

Sample Responses

Successful Response

{
      "date_created": "2021-03-27 17:26:52",
      "recog": {
        "status": 0,
        "hypotheses": [
          {
            "confidence": 10000000000,
            "utterance": "Ọláyínká."
          }
        ],
        "id": "2c87cb2e-c694-49c8-b9e5-77d92072d4bb"
      },
      "lang-id": "yor8k_names",
      "api_version": "v0.0.1",
      "account_sid": "test_device"
}

Failed response: No available workers

{
      "date_created": "2021-03-27 17:26:41",
      "recog": {
        "status": 503,
        "error_message": "No yor8k workers available",
        "error_code": 1,
        "id": "9f089d9a-842b-4a49-be41-40feb9ca4c7f"
      },
      "lang-id": "yor8k",
      "api_version": "v0.0.1",
      "account_sid": "test_device"
}
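A "No workers available" response is likely to be temporary, so a client may choose to retry. The sketch below retries with exponential backoff when recog.status is 503, as in the failed sample. The retry policy is an assumption and not documented service behavior, and transcribe_with_retry together with its send parameter are hypothetical names; send stands in for whatever function issues the request and returns the parsed JSON.

```python
import time


def transcribe_with_retry(send, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call send() until recog.status is not 503 or attempts run out."""
    response = None
    for attempt in range(attempts):
        response = send()
        if response.get("recog", {}).get("status") != 503:
            return response
        if attempt < attempts - 1:
            # Exponential backoff: base_delay, then 2x, 4x, ...
            sleep(base_delay * 2 ** attempt)
    return response


# Demonstration with canned responses instead of real network calls.
busy = {"recog": {"status": 503, "error_message": "No yor8k workers available", "error_code": 1}}
done = {"recog": {"status": 0, "hypotheses": [{"confidence": 1, "utterance": "Ọláyínká."}]}}
queue = [busy, busy, done]
delays = []
result = transcribe_with_retry(lambda: queue.pop(0), sleep=delays.append)
print(result["recog"]["status"], delays)
```

Passing sleep as a parameter lets the backoff be exercised without real waiting, as the demonstration does by recording the delays in a list.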