ajala's Text-to-Speech service converts digital text into natural-sounding, machine-generated speech in a variety of African languages. This tutorial provides a quick introduction to sending requests to the service and receiving its responses.
Step 1: Create an account
Go to https://api.ajala.ai and create a free account. Credentials for accessing ajala’s speech platform are provisioned automatically for your account. To request rate limits for development testing or production use, please send an email to support@ajalaco.com.
Step 2: Get your security credentials
Copy your API key from your account dashboard.
Step 3: Synthesize speech in Tanzanian Kiswahili
Issue the following request to generate a WAV file that says “Jina lako nani”. Before you send it:
Ensure curl is installed on your machine.
Replace “API_KEY” with the API key from your account.
The response will provide a WAV file that articulates “Jina lako nani”.
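The request in this step can also be sketched in Python using only the standard library. This is a minimal illustration, not the service's official client: it assumes the API is served from the same host as the account site (https://api.ajala.ai), that the text travels in the request body as UTF-8 plain text, and that a `Content-Type` header is appropriate. The voice and format headers are not named here, so the caller supplies them through the hypothetical `extra_headers` argument.

```python
import urllib.request

# Assumption: the API host is the same as the account site.
BASE_URL = "https://api.ajala.ai"


def build_synthesis_request(api_key, text, extra_headers=None):
    """Build (but do not send) a POST request to the synthesis endpoint."""
    headers = {
        "apikey": api_key,
        # Assumption: the service accepts this content type for plain text.
        "Content-Type": "text/plain; charset=utf-8",
    }
    # Voice and audio-format parameters are passed as headers; their names
    # come from the service's header table, so the caller provides them.
    headers.update(extra_headers or {})
    return urllib.request.Request(
        BASE_URL + "/tts/dynamic/client",
        data=text.encode("utf-8"),
        headers=headers,
        method="POST",
    )


def save_speech(request, out_path="speech.wav"):
    """Send the request and write the returned audio bytes to out_path."""
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as f:
        f.write(response.read())
    return out_path


request = build_synthesis_request("API_KEY", "Jina lako nani")
# save_speech(request)  # uncomment to call the live service
```

Building the request separately from sending it makes it possible to inspect the URL, headers, and body before any network call is made.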
ajala’s Text-to-Speech service provides RESTful APIs that deliver access to ajala’s African language Text-to-Speech solutions. ajala focuses on delivering high-quality, natural-sounding voices that accurately capture dialectal, tonal, and regional variations in supported African languages. Models are trained in a low-resource context, and our first release includes a variety of concatenative voices. For updates on subsequent releases, please sign up for our newsletter.
The service can be integrated across various channels, including IVR, chatbots, mobile apps, and other bespoke applications. The service accepts plain text and Speech Synthesis Markup Language (SSML). ajala offers premium accounts support for integrating our platform with bespoke solutions. Please email support@ajalaco.com for additional information.
Currently, the Text-to-Speech service supports concatenative male and/or female voices in the following languages:
For more information, see Supported voices.
Currently, the Text-to-Speech service supports streaming audio in response to synthesis requests in the following MIME type formats:
For more information, see Audio formats.
For inquiries concerning premium account pricing and bespoke/custom voices, please contact sales@ajalaco.com.
ajala’s Text-to-Speech (TTS) service provides the ability to synthesize speech in various African languages using ajala’s speech synthesis capabilities. The service is accessible via ajala’s REST APIs, and provides male and/or female voices in supported languages. The speech synthesis service provides an HTTP interface that accepts SSML and plain text requests.
The current version of the service authenticates access using API keys passed in the request header, as apikey. Your auth keys and service access should be managed through your user account.
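One way to keep the key out of source code is to read it from the environment and attach it as the `apikey` header on every request. The sketch below is illustrative only: the environment variable name `AJALA_API_KEY` and both helper functions are hypothetical, not part of the service.

```python
import os
import urllib.request


def load_api_key(env_var="AJALA_API_KEY"):
    """Read the API key from an environment variable (name is hypothetical)."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set {env_var} to the API key from your dashboard")
    return key


def authenticated_request(url, api_key, method="GET", data=None):
    """Build a request that carries the key in the `apikey` header."""
    return urllib.request.Request(
        url, data=data, headers={"apikey": api_key}, method=method
    )
```

HTTP header names are case-insensitive, so `urllib` normalizing the name to `Apikey` on the wire does not change its meaning.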
Header
Parameters are passed to the Text-to-Speech service in the headers. The service expects the following parameters in the request header:
Supported voices
We aim to provide male and female voices in all supported languages. The table below summarizes all voices ajala currently supports.
Audio formats
The service currently supports returning audio in any of the following formats:
Data Collection
The Text-to-Speech service automatically logs all requests and responses in an anonymized manner, as part of overall system monitoring. Data from requests may be used to implement improvements to the service, and such data is not made public. The service stores data in a manner that supports the European Union’s General Data Protection Regulation, and complies with ajala’s security and data privacy protocols.
The Text-to-Speech service uses standard HTTP response codes to indicate the status of the request. Unsuccessful requests are accompanied by an error_code and error_message that provide additional context around the failure.
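A client can turn a failed response into a readable message by combining the HTTP status with the two error fields. The sketch below assumes that the error body is a JSON object with `error_code` and `error_message` at the top level; the exact body layout is not shown here, so the function falls back to placeholders when the fields are missing.

```python
import json


def describe_failure(status, body):
    """Summarize an unsuccessful response as a single line of text."""
    try:
        payload = json.loads(body)
    except (ValueError, TypeError):
        payload = {}  # body was empty, not JSON, or not decodable
    if not isinstance(payload, dict):
        payload = {}
    # Assumption: both fields sit at the top level of the JSON object.
    code = payload.get("error_code", "unknown")
    message = payload.get("error_message", "no message provided")
    return f"HTTP {status}: {code}: {message}"
```

With `urllib`, an unsuccessful request raises `urllib.error.HTTPError`; its `code` attribute and `read()` result can be passed straight to `describe_failure`.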
The Text-to-Speech service supports a number of methods for initiating and managing speech synthesis requests.
Voices
The Text-to-Speech service includes service endpoints for retrieving information related to currently available voices.
List Voices: /tts/service/voices (GET)
The service provides an HTTP GET endpoint that returns a JSON array of all voices ajala’s Text-to-Speech service currently supports.
Sample Request
Sample Response
List Voice: /tts/service/voices/{language} (GET)
The service provides an HTTP GET endpoint that returns a JSON array of all voices ajala’s Text-to-Speech service currently supports for a given language.
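Both voice endpoints can be queried with one small helper. This is a sketch under two assumptions: the API host matches the account site (https://api.ajala.ai), and the language identifier is placed in the path as-is after URL-escaping. The identifier format itself is not specified here, so the usage note uses a placeholder.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the API host is the same as the account site.
BASE_URL = "https://api.ajala.ai"


def voices_url(language=None):
    """Return the URL for all voices, or for one language's voices."""
    url = BASE_URL + "/tts/service/voices"
    if language is not None:
        # Escape the identifier so it is safe as a single path segment.
        url += "/" + urllib.parse.quote(language, safe="")
    return url


def list_voices(api_key, language=None):
    """GET the endpoint and decode the JSON array it returns."""
    request = urllib.request.Request(
        voices_url(language), headers={"apikey": api_key}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `list_voices("API_KEY")` would request every voice, while `list_voices("API_KEY", "LANGUAGE")` would request the voices for one language, with `LANGUAGE` standing in for a real identifier.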
Response Fields
Status Code
Synthesize audio: /tts/dynamic/client (POST)
The Text-to-Speech synthesis service accepts HTTP POST synthesis requests where the text is posted in the body of the request, or as a request parameter. The maximum text size acceptable for a synthesis request is xxx.
A UTF-8 encoded plain text file can be posted to the service for synthesis:
Alternatively, the text may be submitted as a request parameter:
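The two submission styles can be sketched side by side in Python. Several details are assumptions rather than documented behavior: the API host (https://api.ajala.ai), the `Content-Type` header, and in particular the parameter name `text` and its placement in the URL query string, which are hypothetical.

```python
import urllib.parse
import urllib.request

# Assumption: the API host is the same as the account site.
BASE_URL = "https://api.ajala.ai"
SYNTH_PATH = "/tts/dynamic/client"


def synth_from_file(api_key, path):
    """POST the contents of a UTF-8 plain text file as the request body."""
    with open(path, "rb") as f:
        body = f.read()
    body.decode("utf-8")  # raises UnicodeDecodeError if the file is not UTF-8
    headers = {"apikey": api_key, "Content-Type": "text/plain; charset=utf-8"}
    return urllib.request.Request(
        BASE_URL + SYNTH_PATH, data=body, headers=headers, method="POST"
    )


def synth_from_parameter(api_key, text, param_name="text"):
    """POST with the text carried as a query parameter (name is hypothetical)."""
    query = urllib.parse.urlencode({param_name: text})
    return urllib.request.Request(
        f"{BASE_URL}{SYNTH_PATH}?{query}",
        headers={"apikey": api_key},
        method="POST",
    )
```

Either request object can then be passed to `urllib.request.urlopen` to receive the synthesized audio.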
Response Fields
Status Code
Sample Responses
Successful Response
Failed Response: No available workers
Failed Response: Synthesis Failure
ajala's Speech-to-Text service transcribes human speech into digital text in a variety of African languages. This tutorial provides a quick introduction to sending requests to the service and receiving its responses.
Step 1: Create an account
Go to https://api.ajala.ai and create a free account. Credentials for accessing ajala’s speech platform are provisioned automatically for your account. To request rate limits for development testing or production use, please send an email to support@ajalaco.com.
Step 2: Get your security credentials
Copy your API key from your account dashboard.
Step 3: Download a sample audio recording of Yoruba speech <link to file>
Step 4: Transcribe the sample Yoruba recording
Issue the following request to generate a JSON response that includes a transcription of the sample audio file. Before you send it:
Ensure curl is installed on your machine, and replace “API_KEY” with the API key from your account.
A successful request will return a JSON response that resembles:
ajala’s Speech-to-Text service provides RESTful APIs that deliver access to ajala’s African language Speech-to-Text solutions. ajala focuses on delivering high-quality transcriptions that remain accurate across a variety of tonal and regional variations in supported African languages. Models are available that support limited-vocabulary conversational speech and context-specific entities, e.g. names, numbers, and places. Additionally, models are available for high-resolution (>16kHz) and low-resolution (8kHz) audio modalities, and transcription supports a variety of audio encodings. For updates on subsequent releases, please sign up for our newsletter.
The service can be integrated across various channels, including IVR, chatbots, mobile apps, and other bespoke applications. ajala offers premium accounts support for integrating our platform with bespoke solutions, as well as the possibility of customizing our acoustic and language models for bespoke use cases. Please email support@ajalaco.com for additional information.
Currently, the Speech-to-Text service supports voice recognition in the following languages:
For more information, see Supported voices.
Currently, the Speech-to-Text service accepts audio in transcription requests in the following MIME type formats:
For more information, see Audio formats.
For inquiries concerning premium account pricing and bespoke/custom models, please contact sales@ajalaco.com.
ajala’s Speech-to-Text (STT) service provides the ability to transcribe speech in various African languages using ajala’s speech recognition capabilities. The service is accessible via ajala’s REST APIs, and supports a variety of audio encodings and modalities.
The current version of the service authenticates access using API keys passed in the request header, as apikey. Your auth keys and service access should be managed through your user account.
Examples are presented using the Node.js HTTP library.
Header
Parameters are passed to the Speech-to-Text service in the headers. The service expects the following parameters in the request header:
Supported voices
We aim to provide conversational and context-optimized models for counting, names, and common places in all supported languages. We also provide model varieties that support high-resolution (>16kHz) and low-resolution (8kHz) audio. The table below summarizes all models ajala currently supports.
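Choosing between the high-resolution and low-resolution model varieties depends on the sample rate of the recording. For WAV input, the rate can be read with Python's standard `wave` module, as in the sketch below. The classification rule is an assumption: it treats 16 kHz and above as high-resolution and exactly 8 kHz as low-resolution, and returns `None` for any other rate rather than guessing.

```python
import io
import wave


def wav_sample_rate(wav_bytes):
    """Return the sample rate in Hz stored in a WAV file's header."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        return wav.getframerate()


def resolution_class(sample_rate_hz):
    """Map a sample rate to a model variety (thresholds are assumptions)."""
    if sample_rate_hz >= 16000:
        return "high-resolution"
    if sample_rate_hz == 8000:
        return "low-resolution"
    return None  # no documented model variety matches this rate
```

Checking the rate before sending audio avoids submitting a recording to a model variety that was not built for it.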
Audio formats
The service currently supports receiving audio in any of the following formats:
Data Collection
The Speech-to-Text service automatically logs all requests and responses in an anonymized manner, as part of overall system monitoring. Data from requests may be used to implement improvements to the service, and such data is not made public. The service stores data in a manner that supports the European Union’s General Data Protection Regulation, and complies with ajala’s security and data privacy protocols.
The Speech-to-Text service uses standard HTTP response codes to indicate the status of the request. Unsuccessful requests are accompanied by an error_code and error_message that provide additional context around the failure.
The Speech-to-Text service supports a number of methods for initiating and managing speech recognition requests.
Languages
The Speech-to-Text service includes service endpoints for retrieving information related to currently available recognition models.
List Languages: /asr/service/languages (GET)
The service provides an HTTP GET endpoint that returns a JSON array of all recognition models ajala’s Speech-to-Text service currently supports.
Sample Request
Sample Response
List Language: /asr/service/languages/{language} (GET)
The service provides an HTTP GET endpoint that returns a JSON array of all recognition models ajala’s Speech-to-Text service currently supports for a given language.
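Because both language endpoints return a JSON array, a client can build the request and check the shape of the reply before using it. The sketch below assumes the API host matches the account site (https://api.ajala.ai); the helper names are hypothetical, and the fields inside each array element are not assumed.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the API host is the same as the account site.
BASE_URL = "https://api.ajala.ai"


def languages_request(api_key, language=None):
    """Build a GET request for all models, or for one language's models."""
    path = "/asr/service/languages"
    if language is not None:
        path += "/" + urllib.parse.quote(language, safe="")
    return urllib.request.Request(
        BASE_URL + path, headers={"apikey": api_key}, method="GET"
    )


def parse_model_list(body):
    """Decode a response body and confirm it is a JSON array."""
    models = json.loads(body)
    if not isinstance(models, list):
        raise ValueError("expected a JSON array of recognition models")
    return models
```

The body passed to `parse_model_list` would be the bytes read from `urllib.request.urlopen(languages_request(...))`.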
Sample Request
Sample Response
Response Fields
Status Code
Transcribe audio: /tts/dynamic/client (POST)
The Speech-to-Text service accepts HTTP POST transcription requests where the audio is posted in the body of the request. The maximum audio size acceptable for a transcription request is xxx.
An audio file can be posted to the service along with a recognizer-id indicating the model that should be used for transcription:
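A transcription request of this kind might be assembled as follows. This is a sketch with several labeled assumptions: the recognizer is identified by a header named exactly `recognizer-id` (following the convention that parameters travel in headers), the audio bytes form the request body, and a `Content-Type` header describes the encoding. The endpoint URL is left to the caller rather than guessed.

```python
import urllib.request


def build_transcription_request(endpoint_url, api_key, audio_bytes,
                                recognizer_id, content_type):
    """Build (but do not send) a POST request carrying audio to transcribe."""
    headers = {
        "apikey": api_key,
        # Assumption: the model is selected with a header of this exact name.
        "recognizer-id": recognizer_id,
        # Assumption: the audio encoding is declared via Content-Type.
        "Content-Type": content_type,
    }
    return urllib.request.Request(
        endpoint_url, data=audio_bytes, headers=headers, method="POST"
    )


def read_audio(path):
    """Load a recording from disk as raw bytes."""
    with open(path, "rb") as f:
        return f.read()
```

A call would look like `build_transcription_request(url, "API_KEY", read_audio("sample.wav"), "RECOGNIZER_ID", "audio/wav")`, where `url`, `sample.wav`, and `RECOGNIZER_ID` are placeholders for real values.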
Response Fields
Status Code
Sample Responses
Successful Response
Failed Response: No available workers