This repository stores the definitions and generated code for Speechly public APIs.
Higher-level client libraries are also available for selected platforms. They handle microphone and audio management as well as connection state management, which you would otherwise have to implement on top of these definitions. See Speechly Client Libraries for more information.
Protocol buffers definitions are located in proto/. The actual code generation is done with prototool. The supported languages are:
Protobuf stub generation is pretty easy, so if you need support for a language not in the list, you can always generate the stubs separately.
Make sure to check language-specific READMEs.
See the language-specific examples in the respective subdirectories for a more detailed description of how to use the generated code. The following describes the basic API flow of a Speechly client, which sends speech to the API and receives results at the same time.
An API Reference, generated from the protobuf source files, contains detailed documentation about the APIs.
All gRPC connections to Speechly APIs must use secure channels, meaning that the connection is made using TLS encryption. The secure channel should be opened to api.speechly.com:443. This channel can then be used to access all of the APIs.
The first step in connecting to the Speechly API is to call speechly.identity.v2.IdentityAPI and create an access token to use for future calls.
1. Create a LoginRequest and add:
   - device_id, a device identifier that the API can use to match the microphone acoustic profile
   - app_id to select a specific Speechly application to use, or
   - project_id to use a project containing multiple applications
2. Call speechly.identity.v2.IdentityAPI/Login (the stubs help here).
3. LoginResponse will contain an access token and expiry information. A new access token should be fetched before the expiration to prevent unnecessary errors.

IdentityAPI/Login is the only API call that does not require authentication metadata. All other APIs require that the access token received from Login is attached to the request metadata with key authorization and value Bearer TOKEN (replace TOKEN with the actual token).
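The authorization metadata described above can be sketched in Python as a small helper. This is a minimal illustration, not part of the generated stubs: the name `auth_metadata` is hypothetical, and the only facts taken from this document are the key `authorization` and the value format `Bearer TOKEN`.

```python
from typing import Sequence, Tuple

# gRPC metadata is commonly represented as a sequence of (key, value) pairs.
Metadata = Sequence[Tuple[str, str]]


def auth_metadata(token: str) -> Metadata:
    """Build call metadata carrying the access token received from Login.

    `auth_metadata` is a hypothetical helper name used for illustration.
    """
    if not token:
        raise ValueError("access token must not be empty")
    # Key and value format as described above: authorization / Bearer TOKEN.
    return (("authorization", f"Bearer {token}"),)
```

With the `grpcio` package, a tuple like this can be passed as the `metadata=` argument of a stub call.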
If the token is expired or otherwise invalid, all API calls will terminate with gRPC status code PERMISSION_DENIED. A reason is included in the error details.
The token will expire after a certain amount of time, stated in the LoginResponse message. It is a good idea to keep a received token, reuse it for multiple connections, and refresh it only when it is close to expiration. This avoids an extra Login call before each connection and keeps the API calls as fast as possible.
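One way to reuse a token until it is close to expiration is a small cache around the login call. The sketch below is an assumption-laden illustration: `TokenCache`, `LoginFn`, and `refresh_margin_s` are hypothetical names, and the login callable is assumed to return the token together with its expiry as Unix epoch seconds. How the expiry is read from LoginResponse should be checked against the API Reference.

```python
import time
from typing import Callable, Optional, Tuple

# Hypothetical login callable: performs IdentityAPI/Login and returns
# (token, expiry as Unix epoch seconds). The expiry representation is an
# assumption made for this sketch.
LoginFn = Callable[[], Tuple[str, float]]


class TokenCache:
    """Reuse an access token and refresh it shortly before it expires."""

    def __init__(
        self,
        login: LoginFn,
        refresh_margin_s: float = 60.0,
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._login = login
        self._margin = refresh_margin_s
        self._clock = clock
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        # Log in when there is no token yet, or when the current token
        # expires within the refresh margin.
        if self._token is None or self._clock() >= self._expires_at - self._margin:
            self._token, self._expires_at = self._login()
        return self._token
```

Injecting the clock keeps the class testable without waiting for a real token to expire.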
speechly.slu.v1.SLU/Stream is used to send audio and receive results based on the target Speechly application configuration. An access token from IdentityAPI is required to access the SLU API.
A generic example of an SLU connection:
1. Open a stream to speechly.slu.v1.SLU/Stream. Remember to include the access token in the stream's metadata.
2. All requests are of type SLURequest and all responses are of type SLUResponse. These are envelopes that will contain different types of data, depending on the situation:
   1. Send an SLURequest.config message, describing the audio stream.
   2. Send an SLURequest.event.START message when the speech stream is started.
   3. Receive an SLUResponse.started message, containing the audioContext id.
   4. Send audio in SLURequest.audio messages.
   5. As the SLU stream is bidirectional, it will receive data at the same time as it sends data. Refer to the docs to see the meaning of different types of SLUResponse.
   6. Send an SLURequest.event.STOP message to end the speech stream.
   7. Receive an SLUResponse.finished event, containing the audioContext id that was finished.

The connection can be kept open, but an active speech stream (audioContext) will have a maximum duration of 5 minutes.
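The client-side half of this flow can be sketched as a generator that yields requests in the required order. Plain dicts stand in for SLURequest envelopes here, so this is only an illustration of the ordering. The function name `slu_requests` and the dict layout are hypothetical, and with real stubs each dict would be an SLURequest message with the matching field set.

```python
from typing import Dict, Iterable, Iterator

# Plain dicts are stand-ins for SLURequest envelopes (an assumption made so
# the sketch runs without generated stubs).
Request = Dict[str, object]


def slu_requests(
    audio_chunks: Iterable[bytes],
    audio_config: Dict[str, object],
) -> Iterator[Request]:
    """Yield the client-side messages of one speech stream, in order."""
    yield {"config": audio_config}  # describe the audio stream first
    yield {"event": "START"}        # start the speech stream (audioContext)
    for chunk in audio_chunks:
        if chunk:                   # skip empty chunks
            yield {"audio": chunk}
    yield {"event": "STOP"}         # end the speech stream
```

Because the stream is bidirectional, responses such as started and finished arrive while this generator is still being consumed, so a real client would read responses concurrently rather than after the last request.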
There are other APIs that can be used to manage Speechly applications. Instead of integrating with these, a quicker alternative is to use the Speechly command. Nevertheless, the APIs are documented and usable, if so required.
The Speechly API supports automatic transcoding for HTTP/1.1 REST access with JSON content. This means that gRPC services are also exposed over HTTP and can be used with any REST toolchain (curl, Postman, etc.). The only exception is the SLU API, which is a bidirectional streaming API and cannot be represented in HTTP.
The transcoding is implemented with an Envoy filter and mostly uses the default bindings. To call the IdentityAPI, for example:
curl https://api.speechly.com/speechly.identity.v2.IdentityAPI/Login -d "{\"deviceId\": \"$DEVICEID\", \"application\": {\"appId\": \"$APPID\"}}"
and to call an API requiring authorization:
curl https://api.speechly.com/speechly.slu.v1.WLU/Text -H "Authorization: Bearer $TOKEN" -d '{"text": "show python repos"}'
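The same two calls can be prepared from Python with only the standard library. The sketch below builds the requests without sending them. `build_request` is a hypothetical helper, the placeholder ids are not real values, and the explicit `Content-Type: application/json` header is an assumption that the curl examples above do not make.

```python
import json
import urllib.request
from typing import Optional

API_BASE = "https://api.speechly.com"


def build_request(
    method_path: str, body: dict, token: Optional[str] = None
) -> urllib.request.Request:
    """Build (but do not send) a POST request for a transcoded gRPC method."""
    # Sending an explicit JSON content type is an assumption of this sketch.
    headers = {"Content-Type": "application/json"}
    if token is not None:
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        url=f"{API_BASE}/{method_path}",
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


# Login needs no token; the ids below are placeholders.
login = build_request(
    "speechly.identity.v2.IdentityAPI/Login",
    {"deviceId": "DEVICE-ID-PLACEHOLDER", "application": {"appId": "APP-ID-PLACEHOLDER"}},
)
```

A request built this way can be sent with `urllib.request.urlopen(login)`, and the JSON response body parsed with `json.load`.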
The mapping for transcoding is implemented by generating a descriptor set file, which is located in this repository (speechly_api.pb). This file can also be used with grpcurl to provide type mapping for command-line gRPC access.
See also Google's protobuf annotations for transcoding HTTP/JSON to gRPC.
The build is done with make and docker. You can run the build for all languages with make build from the root of this repo.