Swiftpack.co - MerlynMind/kotlin_speech_features as Swift Package

MerlynMind/kotlin_speech_features v1.0.0
This library provides common speech features for ASR including MFCCs and filterbank energies for Android and iOS.
โญ๏ธ 12
๐Ÿ•“ 2 weeks ago
.package(url: "https://github.com/MerlynMind/kotlin_speech_features.git", from: "1.0.0")

Kotlin Speech Features


📒 Introduction

This library is a complete port of python_speech_features in pure Kotlin available for Android and iOS projects.

It provides common speech features for automatic speech recognition (ASR), including MFCCs and filterbank energies.
To learn more about MFCCs, read more.
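
For orientation, both MFCCs and filterbank energies are computed on the mel scale. The sketch below shows the Hz-to-mel mapping used by python_speech_features. The function names `hzToMel` and `melToHz` are illustrative, and it is an assumption that this port uses the same constants.

```kotlin
import kotlin.math.log10
import kotlin.math.pow

// Illustrative helpers, not part of the library's documented API.
// Constants follow python_speech_features; assumed to match this port.

// Convert a frequency in Hz to mels.
fun hzToMel(hz: Double): Double = 2595.0 * log10(1.0 + hz / 700.0)

// Convert a value in mels back to Hz (inverse of hzToMel).
fun melToHz(mel: Double): Double = 700.0 * (10.0.pow(mel / 2595.0) - 1.0)
```

Filterbank centre frequencies are typically spaced evenly in mels between `lowFreq` and `highFreq` and then mapped back to Hz, which is why the filters are narrow at low frequencies and wide at high ones.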

Features

🙋 How to use

We support multiple platforms using Kotlin Multiplatform.

Android

Integration

Add jitpack.io to your project's repositories:

allprojects {
  repositories {
    google()
    maven { url 'https://jitpack.io' }
  }
}

Add the dependency:

dependencies {
    implementation "com.github.MerlynMind:kotlin_speech_features:${version}"
}
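
For projects that use the Gradle Kotlin DSL (`build.gradle.kts`) rather than Groovy, the same two steps might look roughly like this. This is a sketch: `speechFeaturesVersion` is a placeholder name introduced here, and its value must be replaced with the release you want to use.

```kotlin
// Root build.gradle.kts: add JitPack next to the existing repositories.
allprojects {
    repositories {
        google()
        maven { url = uri("https://jitpack.io") }
    }
}

// Module build.gradle.kts: declare the dependency.
// Placeholder value; substitute a published release of the library.
val speechFeaturesVersion = "<version>"

dependencies {
    implementation("com.github.MerlynMind:kotlin_speech_features:$speechFeaturesVersion")
}
```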

Example implementation

A sample app is included in this repo to help understand the implementation.

  1. Convert your audio signal into a float array (a demo is provided in the sample app).
  2. Initialize speech features
    private val speechFeatures = SpeechFeatures()
    
  3. Perform any of the 4 operations:
    val result = speechFeatures.mfcc(MathUtils.normalize(wav), nFilt = 64)
    val result = speechFeatures.fbank(MathUtils.normalize(wav), nFilt = 64)
    val result = speechFeatures.logfbank(MathUtils.normalize(wav), nFilt = 64)
    val result = speechFeatures.ssc(MathUtils.normalize(wav), nFilt = 64)
    
  4. The result will contain matrices with the expected features. Pass these features on for further processing (e.g. classification, speech recognition).
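
Step 1 depends on how your audio is stored. The sketch below assumes raw, headerless 16-bit little-endian mono PCM, which is not necessarily what the sample app uses. `pcm16ToFloatArray` is a hypothetical helper, not part of the library.

```kotlin
import java.nio.ByteBuffer
import java.nio.ByteOrder

// Hypothetical helper (not part of kotlin_speech_features).
// Decodes raw 16-bit little-endian PCM bytes into one float per sample.
// Values are left unscaled, in the range -32768..32767.
fun pcm16ToFloatArray(pcm: ByteArray): FloatArray {
    require(pcm.size % 2 == 0) { "16-bit PCM needs an even number of bytes" }
    val samples = ByteBuffer.wrap(pcm)
        .order(ByteOrder.LITTLE_ENDIAN)
        .asShortBuffer()
    return FloatArray(samples.remaining()) { i -> samples.get(i).toFloat() }
}
```

The resulting array can then be passed to `MathUtils.normalize(...)` as in step 3. If your input is a WAV file, skip its header before decoding.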

iOS

Integration

  1. In Xcode, go to File > Add Packages...
  2. Paste the URL of this repo into the search box
  3. Select the package that is found
  4. Click the Add Package button

Example implementation

A sample app is included in this repo to help understand the implementation.

  1. Convert your audio signal into a KotlinIntArray and normalize it.
    import KotlinSpeechFeatures
    
    let signal = [Int](1...1000) // Example signal
    let normalized = MathUtils.Companion.init().normalize(sig: toKotlinIntArray(arr: signal))
    
    func toKotlinIntArray(arr: [Int]) -> KotlinIntArray {
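        // Size the Kotlin array by element count; capacity can exceed count
        let result = KotlinIntArray(size: Int32(arr.count))
        // A half-open range is safe for an empty array
        for i in 0..<arr.count {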
            result.set(index: Int32(i), value: Int32(arr[i]))
        }
        return result
    }
    
  2. Initialize speech features
    let speechFeatures = SpeechFeatures()
    
  3. Perform any of the 4 operations:
    let result = speechFeatures.mfcc(signal: normalized, sampleRate: 16000, winLen: 0.025, winStep: 0.01, numCep: 13, nFilt: 64, nfft: 512, lowFreq: 0, highFreq: nil, preemph: 0.97, ceplifter: 22, appendEnergy: true, winFunc: nil)
    let result = speechFeatures.fbank(signal: normalized, sampleRate: 16000, winLen: 0.025, winStep: 0.01, nFilt: 64, nfft: 512, lowFreq: 0, highFreq: nil, preemph: 0.97, winFunc: nil)
    let result = speechFeatures.logfbank(signal: normalized, sampleRate: 16000, winLen: 0.025, winStep: 0.01, nFilt: 64, nfft: 512, lowFreq: 0, highFreq: nil, preemph: 0.97, winFunc: nil)
    let result = speechFeatures.ssc(signal: normalized, sampleRate: 16000, winLen: 0.025, winStep: 0.01, nFilt: 64, nfft: 512, lowFreq: 0, highFreq: nil, preemph: 0.97, winFunc: nil)
    
  4. The result will contain matrices with the expected features. Pass these features on for further processing (e.g. classification, speech recognition).
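
The `sampleRate`, `winLen` and `winStep` arguments in the calls above determine how many frames the signal is split into, and so how many rows to expect in the result. The Kotlin sketch below reproduces the framing rule from python_speech_features. It is an assumption that this port frames the signal the same way, and `frameCount` is an illustrative name, not a library function.

```kotlin
import kotlin.math.ceil
import kotlin.math.roundToInt

// Illustrative helper: number of analysis frames for a signal,
// following the framing rule of python_speech_features.
fun frameCount(numSamples: Int, sampleRate: Int, winLen: Double, winStep: Double): Int {
    val frameLen = (winLen * sampleRate).roundToInt()   // samples per window
    val frameStep = (winStep * sampleRate).roundToInt() // samples between window starts
    if (numSamples <= frameLen) return 1                // short signals yield one padded frame
    return 1 + ceil((numSamples - frameLen).toDouble() / frameStep).toInt()
}
```

With the values used above (16000 Hz, 0.025 s window, 0.01 s step), each window covers 400 samples and windows start every 160 samples, so one second of audio gives `frameCount(16000, 16000, 0.025, 0.01)`, which is 99 frames. The 400-sample window also fits within the `nfft` of 512.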

JavaScript

Coming soon...

โœ๏ธ Contributing

Interested in contributing to the library? Thank you so much for your interest! We are always looking for improvements to the project and contributions from open-source developers are greatly appreciated.

  1. Clone the repo and create a new branch:
    git clone https://github.com/merlynmind/kotlin_speech_features
    cd kotlin_speech_features
    git checkout -b name_for_new_branch
  2. Make changes and test them
  3. Submit a pull request with a comprehensive description of the changes

🌟 Spread the word!

If you want to say thank you and/or support active development of this library:

  • Add a GitHub Star to the project!
  • Tweet about the project on your Twitter! Tag @MerlynMind and/or #heyMerlnyn

Thank you so much for your interest in growing the reach of our library!

🧡 Credits

  • Arjun Sunil - Original Author of kotlin speech features
  • Raquib-Ul Alam - For major refactoring and making the code presentable
  • Rob Smith - For Mentoring and helping us to navigate through the task

๐Ÿ“ References

wget http://voyager.jpl.nasa.gov/spacecraft/audio/english.au
sox english.au -e signed-integer english.wav


Release Notes

v1.0.0
2 weeks ago
