DataKit

DataKit is a Swift package for loading, manipulating and analysing datasets in memory. It is built around the DataSet and Column classes: a DataSet holds a collection of Column instances that together represent tabular data.

The DataSet class takes inspiration from the data frame types in R, Python's pandas and Julia's DataFrames.jl.

DataKit is developed to explore machine learning algorithms in Swift. See TreeKit for decision tree classification built on DataKit.

Some guiding principles for the design of the package API are:

  1. Explicitly deal with data types throughout. Don't make any assumptions about what type is intended.
  2. Handle missing data explicitly as Swift Optionals.
  3. Use Swift generics for common functionality, for example through generic column methods.
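
As a rough illustration of principles 2 and 3, here is a minimal sketch in plain Swift. It does not use DataKit's API: SketchColumn and its members are hypothetical names made up for this example. It only shows the general idea of a typed column that stores missing values as Optionals and gains shared behaviour through generics.

```swift
// Hypothetical illustration only. SketchColumn is NOT a DataKit type;
// it is an assumption made up to show the design principles.
struct SketchColumn<T> {
    let name: String
    var values: [T?]   // nil marks a missing value

    /// Number of missing entries in the column.
    var missingCount: Int {
        values.filter { $0 == nil }.count
    }

    /// The non-missing values, in their original order.
    var present: [T] {
        values.compactMap { $0 }
    }
}

// Generic functionality that only applies to floating-point columns.
extension SketchColumn where T: BinaryFloatingPoint {
    /// Mean of the non-missing values, or nil if every value is missing.
    func mean() -> T? {
        let xs = present
        guard !xs.isEmpty else { return nil }
        return xs.reduce(0, +) / T(xs.count)
    }
}

let ages = SketchColumn<Double>(name: "age", values: [31, nil, 45])
print(ages.missingCount)   // prints 1
if let m = ages.mean() {
    print(m)               // prints 38.0
}
```

Because the element type is part of the column's type, the compiler rejects a call such as mean() on a column of strings, and the Optional return value forces the caller to decide what to do when no data is present.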

Installation

Install using the Swift package manager by adding a dependency in a project Package.swift file:

...
dependencies: [
    // Dependencies declare other packages that this package depends on.
    // .package(url: /* package url */, from: "1.0.0"),
    .package(url: "https://github.com/PeetV/DataKit.git", from: "0.4.0"),
],
targets: [
    // Targets are the basic building blocks of a package. A target can define a
    // module or a test suite.
    // Targets can depend on other targets in this package, and on products in
    // packages which this package depends on.
    .target(
        name: ...,
        dependencies: ["DataKit"]),
...

Examples

The docs folder contains interactive playground examples that can be run in Xcode. To read the code without Xcode, see the Contents.swift file in the playground folder.

| Playground           | Description                                                           |
|----------------------|-----------------------------------------------------------------------|
| Overview.playground  | General overview of the DataKit API.                                  |
| Reshaping.playground | Examples of reshaping data, for example reformatting column content.  |

Roadmap

  • [x] 0.1 Integer, Double, String and Bool column type datasets
  • [x] 0.2 Subscripts
  • [x] 0.3 Date column type
  • [x] 0.4 Aggregate by
  • [ ] 0.5 Column functions, including maths and string regex
  • [ ] 0.6 Join datasets

Releases

0.4.0 - Feb 21, 2019

Added the aggregate(by:) method.

0.3.0 - Feb 8, 2019

Added a Date column type. It uses a string column as the underlying data type while enabling date functions on the column.

0.2.0 - Feb 2, 2019

Full suite of column and dataset subscripts

0.1.2 - Jan 22, 2019

Minor fixes

0.1.1 - Jan 21, 2019

Minor fix