This library is a complete port of python_speech_features in pure Kotlin available for Android and iOS projects.
It provides common speech features for automatic speech recognition (ASR), including MFCCs and filterbank energies.
To know more about MFCCs read more.
Multiple platforms are supported through Kotlin Multiplatform.
Add jitpack.io to your project's repositories:
allprojects {
    repositories {
        google()
        maven { url 'https://jitpack.io' }
    }
}
Add the dependency:
dependencies {
    implementation "com.github.MerlynMind:kotlin_speech_features:${version}"
}
A sample app is included in this repo to help understand the implementation.
private val speechFeatures = SpeechFeatures()
val result = speechFeatures.mfcc(MathUtils.normalize(wav), nFilt = 64)
val result = speechFeatures.fbank(MathUtils.normalize(wav), nFilt = 64)
val result = speechFeatures.logfbank(MathUtils.normalize(wav), nFilt = 64)
val result = speechFeatures.ssc(MathUtils.normalize(wav), nFilt = 64)
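The calls above appear to take the audio as an IntArray of samples. As a minimal sketch, assuming the input is raw 16-bit little-endian PCM with any WAV header already stripped, and assuming a JVM or Android target, the bytes could be unpacked as follows. `pcm16ToIntArray` is a hypothetical helper for illustration, not part of the library.

```kotlin
import java.nio.ByteBuffer
import java.nio.ByteOrder

// Hypothetical helper (not part of the library): unpack raw 16-bit
// little-endian PCM bytes into one signed Int per sample.
fun pcm16ToIntArray(pcm: ByteArray): IntArray {
    require(pcm.size % 2 == 0) { "16-bit PCM needs an even number of bytes" }
    // View the bytes as signed 16-bit values in little-endian order.
    val shorts = ByteBuffer.wrap(pcm).order(ByteOrder.LITTLE_ENDIAN).asShortBuffer()
    // Widen each Short to Int; the sign is preserved.
    return IntArray(shorts.remaining()) { i -> shorts.get(i).toInt() }
}
```

The resulting array can then be passed through MathUtils.normalize in the same way as `wav` in the calls above.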
In Xcode, select File > Add Packages... and click the Add Package button.
A sample app is included in this repo to help understand the implementation.
Convert the signal to a KotlinIntArray and normalize it.
import KotlinSpeechFeatures
let signal = [Int](1...1000) // Example signal
let normalized = MathUtils.Companion.init().normalize(sig: toKotlinIntArray(arr: signal))
func toKotlinIntArray(arr: [Int]) -> KotlinIntArray {
    // Size by element count; capacity can exceed the number of elements.
    let result = KotlinIntArray(size: Int32(arr.count))
    for i in arr.indices {
        result.set(index: Int32(i), value: Int32(arr[i]))
    }
    return result
}
let speechFeatures = SpeechFeatures()
let result = speechFeatures.mfcc(signal: normalized, sampleRate: 16000, winLen: 0.025, winStep: 0.01, numCep: 13, nFilt: 64, nfft: 512, lowFreq: 0, highFreq: nil, preemph: 0.97, ceplifter: 22, appendEnergy: true, winFunc: nil)
let result = speechFeatures.fbank(signal: normalized, sampleRate: 16000, winLen: 0.025, winStep: 0.01, nFilt: 64, nfft: 512, lowFreq: 0, highFreq: nil, preemph: 0.97, winFunc: nil)
let result = speechFeatures.logfbank(signal: normalized, sampleRate: 16000, winLen: 0.025, winStep: 0.01, nFilt: 64, nfft: 512, lowFreq: 0, highFreq: nil, preemph: 0.97, winFunc: nil)
let result = speechFeatures.ssc(signal: normalized, sampleRate: 16000, winLen: 0.025, winStep: 0.01, nFilt: 64, nfft: 512, lowFreq: 0, highFreq: nil, preemph: 0.97, winFunc: nil)
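In python_speech_features, winLen and winStep are given in seconds, so at 16 kHz the values above correspond to 400-sample windows advanced by 160 samples. As a rough sketch of how many feature rows to expect, the Kotlin below mirrors the framing rule used by python_speech_features. It assumes this port frames the signal the same way, which is not verified here, and `numFrames` is a hypothetical helper rather than a library function.

```kotlin
import kotlin.math.ceil
import kotlin.math.roundToInt

// Hypothetical helper: number of frames python_speech_features produces for a
// signal of `numSamples` samples. Assumes the Kotlin port behaves the same.
fun numFrames(
    numSamples: Int,
    sampleRate: Int = 16000,
    winLen: Double = 0.025,
    winStep: Double = 0.01
): Int {
    val frameLen = (winLen * sampleRate).roundToInt()   // 400 samples at 16 kHz
    val frameStep = (winStep * sampleRate).roundToInt() // 160 samples at 16 kHz
    // A signal no longer than one window still yields a single frame.
    if (numSamples <= frameLen) return 1
    // Otherwise one frame, plus one per (possibly partial) step after it.
    return 1 + ceil((numSamples - frameLen).toDouble() / frameStep).toInt()
}
```

Under these assumptions, one second of audio at 16 kHz (16000 samples) gives 99 frames.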
Coming soon...
Interested in contributing to the library? Thank you so much for your interest! We are always looking for improvements to the project and contributions from open-source developers are greatly appreciated.
git clone https://github.com/merlynmind/kotlin_speech_features && cd kotlin_speech_features
git checkout -b name_for_new_branch
If you want to say thank you and/or support active development of this library:
Thank you so much for your interest in growing the reach of our library!
wget http://voyager.jpl.nasa.gov/spacecraft/audio/english.au
sox english.au -e signed-integer english.wav