SWCompression

A framework with (de)compression algorithms and functions for processing various archives and containers.

What is this?

SWCompression is a framework with a collection of functions for:

  1. Decompression (and sometimes compression) using different algorithms.
  2. Reading (and sometimes writing) archives of different formats.
  3. Reading (and sometimes writing) containers such as ZIP, TAR and 7-Zip.

It works on both Apple platforms and Linux.

All features are listed in the tables below. "TBD" means that the feature is planned but not yet implemented.

|               | Deflate | BZip2 | LZMA/LZMA2 |
| ------------- | ------- | ----- | ---------- |
| Decompression | ✅      | ✅    | ✅         |
| Compression   | ✅      | ✅    | TBD        |

|       | Zlib | GZip | XZ  | ZIP | TAR | 7-Zip |
| ----- | ---- | ---- | --- | --- | --- | ----- |
| Read  | ✅   | ✅   | ✅  | ✅  | ✅  | ✅    |
| Write | ✅   | ✅   | TBD | TBD | ✅  | TBD   |

SWCompression is written entirely in Swift.

Installation

SWCompression can be integrated into your project using Swift Package Manager, CocoaPods or Carthage.

Swift Package Manager

Add SWCompression to your package dependencies and specify it as a dependency for your target, e.g.:

import PackageDescription

let package = Package(
    name: "PackageName",
    dependencies: [
        .package(url: "https://github.com/tsolomko/SWCompression.git",
                 from: "4.5.0")
    ],
    targets: [
        .target(
            name: "TargetName",
            dependencies: ["SWCompression"]
        )
    ]
)

You can find more details in the Swift Package Manager documentation.

CocoaPods

Add pod 'SWCompression', '~> 4.5' and use_frameworks! to your Podfile.

To complete installation, run pod install.

If you need only some parts of the framework, you can install just those parts using subspecs. Available subspecs:

  • SWCompression/BZip2
  • SWCompression/Deflate
  • SWCompression/Gzip
  • SWCompression/LZMA
  • SWCompression/LZMA2
  • SWCompression/SevenZip
  • SWCompression/TAR
  • SWCompression/XZ
  • SWCompression/Zlib
  • SWCompression/ZIP

"Optional Dependencies"

For both ZIP and 7-Zip there is a most commonly used compression method. This is Deflate for ZIP and LZMA/LZMA2 for 7-Zip. Thus, SWCompression/ZIP subspec has SWCompression/Deflate subspec as a dependency and SWCompression/LZMA subspec is a dependency for SWCompression/SevenZip.

Both of these formats support other compression methods as well, and some of them are implemented in SWCompression. For CocoaPods configurations, these compression methods are available as a kind of "optional dependency".

"Optional dependency" in this context means that SWCompression/ZIP or SWCompression/SevenZip will support a particular compression method only if the corresponding subspec is explicitly specified in your Podfile and installed.

List of "optional dependencies":

  • For SWCompression/ZIP:
    • SWCompression/BZip2
    • SWCompression/LZMA
  • For SWCompression/SevenZip:
    • SWCompression/BZip2
    • SWCompression/Deflate

Note: If you use Swift Package Manager or Carthage you always have everything (ZIP and 7-Zip are built with Deflate, BZip2 and LZMA/LZMA2 support).

Carthage

Add github "tsolomko/SWCompression" ~> 4.5 to your Cartfile.

Then run carthage update.

Finally, drag and drop SWCompression.framework from the Carthage/Build folder into the "Embedded Binaries" section on your target's "General" tab in Xcode.

SWCompression uses the BitByteData framework, so Carthage will also download it, and you should drag and drop the BitByteData.framework file into the "Embedded Binaries" section as well.

Usage

Basic Example

If you'd like to decompress "deflated" data just use:

// let data = <Your compressed data>
let decompressedData = try? Deflate.decompress(data: data)

However, it is unlikely that you will encounter deflated data outside of an archive. For a GZip archive, use:

let decompressedData = try? GzipArchive.unarchive(archive: data)
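When the archive is a file on disk, the same call can be wrapped as shown below. This is only a minimal sketch: the helper name `gunzipFile(at:to:)` and the file paths are made up for illustration and are not part of SWCompression. The whole archive is loaded into memory first, since the function shown above takes a Data value.

```swift
import Foundation
import SWCompression

/// Hypothetical helper (not part of SWCompression): reads a GZip archive from
/// `source`, unarchives it in memory, and writes the result to `destination`.
func gunzipFile(at source: URL, to destination: URL) throws {
    // Load the whole archive into memory.
    let archive = try Data(contentsOf: source)
    // Decompress it with the same call as in the basic example.
    let decompressed = try GzipArchive.unarchive(archive: archive)
    // Store the decompressed bytes at the destination.
    try decompressed.write(to: destination)
}

// Example call (the paths are placeholders):
// try gunzipFile(at: URL(fileURLWithPath: "example.txt.gz"),
//                to: URL(fileURLWithPath: "example.txt"))
```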

Handling Errors

Most SWCompression functions can throw an error, and you are responsible for handling it. If you look at the list of available error types and their cases, you may be frightened by their number. However, most of these cases (such as XZError.wrongMagic) exist for diagnostic purposes.

Thus, you only need to handle the most common type of error for your archive/algorithm. For example:

do {
    // let data = <Your compressed data>
    let decompressedData = try XZArchive.unarchive(archive: data)
    // Use decompressedData here.
} catch let error as XZError {
    // Handle XZ-related errors here.
    print("XZ error: \(error)")
} catch {
    // Handle all other errors here.
    print("Unexpected error: \(error)")
}

Or, if you don't care about errors at all, use try?.

Documentation

Every function or type of SWCompression's public API is documented. This documentation can be found at its own website.

Sophisticated example

There is a small command-line program, "swcomp", which is included in this repository in "Sources/swcomp". To build it you need to uncomment several lines in "Package.swift" and run swift build -c release.

Contributing

Whether you have found a bug or have a suggestion, an idea or something else to share, please create an issue on GitHub.

If you have encountered a bug, it would be especially helpful if you attached the file (archive, etc.) that caused the bug.

If you'd like to contribute code, please create a pull request on GitHub.

Note: If you are considering working on SWCompression, please note that the Xcode project (SWCompression.xcodeproj) was created manually and you shouldn't use the swift package generate-xcodeproj command.

Executing tests locally

If you'd like to run tests on your computer, you need to do an additional step after cloning this repository:

git submodule update --init --recursive

This command downloads the files used for testing. These files are stored in a separate repository because they are kept in Git LFS, and Swift Package Manager has some problems with Git LFS-enabled repositories (installing git-lfs locally with the --skip-smudge option is required to work around these problems).

Note: You can also use the "Utils/prepare-workspace-macos.sh" script from the repository, which not only downloads test files but also downloads dependencies.

Performance

Using whole-module optimization is recommended for best performance. It is enabled by default for Release configurations.

The Tests Results document contains benchmark results for various functions.

Why?

First of all, existing solutions for working with compression, archives and containers have certain disadvantages. They might not support a particular compression algorithm or archive format, and they all have different APIs, which can sometimes be slightly confusing for users. This project attempts to provide missing (and sometimes existing) functionality through a unified API which is easy to use and remember.

Secondly, it may be important to have a compression framework written completely in Swift, without relying on either system libraries or solutions implemented in different languages. Additionally, since SWCompression is written fully in Swift without Objective-C, it can also be used on Linux.

Future plans

See 5.0 Update Project for the list of planned API changes and new features.

  • Performance...
  • Better Deflate compression.
  • Something else...

Support Financially

If you would like to support this project or me financially you can do so via PayPal using this link.

License

MIT licensed

Releases

4.5.2 - Jun 8, 2019

  • Increased the lowest required version of BitByteData dependency to 1.4.1.

Comment: This version of BitByteData fixes its serious incompatibility with Swift 5.

4.5.1 - Mar 13, 2019

  • Minimum required version of BitByteData is now 1.4.0.
  • Updated to support Swift 4.2.
  • Added default values to the properties of LZMAProperties struct based on the documentation from LZMA SDK.
  • Added init() to LZMAProperties struct which sets lc, lp, pb, and dictionarySize properties to their default values.
  • Improved the detection of Swift versions less than 4.2 in the workaround for the crash in Data.prefix(upTo:).
  • Documentation updates:
    • Fixed an outdated example in README (PR #4 by @brianantonelli).
    • Fixed grammar issues related to the usage of articles, during/while, and others.
  • swcomp changes:
    • Minimum required version of SwiftCLI is now 5.2.0.
    • Improved the layout of output of benchmark commands.

4.5.0 - Sep 11, 2018

  • Added LZMAProperties struct with simple member-wise initializer.
  • Added LZMA.decompress(data:properties:uncompressedSize:) function (with uncompressedSize argument being optional) which allows specifying LZMA properties.
    • Useful in situations when properties are known from some external source instead of being encoded at the beginning of data.
    • Note that these new APIs are intended to be used by expert users and as such no validation is performed on LZMA properties values.
  • Added support for Delta "filter" in both XZ archives and 7-Zip containers.
  • Added support for SHA-256 check type in XZ archives.
    • As a result XZError.checkTypeSHA256 is now never thrown and will be removed in the next major update.
  • Added ZipEntryInfo.crc property.
  • Fixed a problem where the XZArchive.unarchive and XZArchive.splitUnarchive functions would produce an incorrect result when more than one "filter" was used. In practice it was nearly impossible to encounter this issue, since LZMA2 was the only supported filter until this update.
  • Reduced in-memory size of ZipEntryInfo instances.
    • Some rough estimates indicate that the reduction is up to 68%.
  • Clarified documentation for LZMA.decompress(data:) to explain expectation about data argument.
    • Particularly, it is explained that it expects LZMA properties encoded with standard LZMA encoding scheme at the beginning of data.
  • swcomp changes:
    • zip -i command now also prints CRC32 for all entries.
    • -v is now accepted as an alias for --verbose option.
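The LZMAProperties additions listed above could be used roughly as in the following sketch. The property values are common LZMA SDK defaults chosen here purely as placeholders, and the helper name `decompressHeaderlessLZMA` is hypothetical. Since no validation is performed on the property values, they have to match whatever was used when the data was compressed.

```swift
import Foundation
import SWCompression

// Properties known from an external source. The values below are only
// placeholders for this sketch (lc = 3, lp = 0, pb = 2, 16 MiB dictionary).
let properties = LZMAProperties(lc: 3, lp: 0, pb: 2, dictionarySize: 1 << 24)

/// Hypothetical helper: decompresses LZMA data that does not begin with
/// encoded properties, using the externally supplied `properties` instead.
func decompressHeaderlessLZMA(_ data: Data, uncompressedSize: Int? = nil) throws -> Data {
    return try LZMA.decompress(data: data,
                               properties: properties,
                               uncompressedSize: uncompressedSize)
}
```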

4.5.0-test - Sep 6, 2018

This is the first and only test release for the upcoming 4.5.0 update. It includes new LZMAProperties APIs, support for SHA-256 check for XZ archives and support for delta filter in 7-Zip and XZ, as well as a couple of fixes.

Known issue: no documentation for new APIs.

4.4.0 - Aug 9, 2018

A couple of side notes before diving into release notes:

  1. I've started a GitHub project board where I am going to track and plan changes and additions for the 5.0 Update.
  2. If you ever wanted to financially support either this project or me, you can now do so using this link.

Creating TAR containers

The main addition in this update is a set of APIs for creating a new TAR container.

  • Added TarContainer.create(from:) function which creates a new TAR container with provided array of TarEntry objects as its content and generates container's Data.
  • Added TarCreateError error type with a single case utf8NonEncodable. Comment: This enum is planned to be merged with TarError in 5.0. A new enum had to be created since introducing a new case to an already existing enum would be a breaking change.

To enable reasonable usage scenarios for these new APIs, additional changes to existing APIs have been made:

  • TarEntry.info and TarEntry.data are now var-properties (instead of let).
  • Setting TarEntry.data now automatically updates TarEntry.info.size with data.count value (or 0 if data is nil). Comment: Maintaining consistency between these two properties is extremely important for producing correct and valid containers.
  • Added (or, rather, made public) TarEntry.init(info:data:) initializer.
  • Most public properties of TarEntryInfo are now var-properties (instead of let). Exceptions: size and type. Comment: Property size is kept read-only for reasons mentioned above. The reason for not allowing mutation of the type property is more vague: it is hard to imagine a usage scenario where changing the type of an entry makes sense. Moreover, there are some concerns about (potential future) behavior in a more generic context with type-erased ContainerEntryInfo objects, etc.
  • Added TarEntryInfo.init(name:type:) initializer.

I do realize that this set of APIs is somewhat limited. For example, it is not easy to convert a ZIP container (array of ZipEntry objects) to TAR using these new additions. But rest assured, there are plans to provide more generic functionality for creating new containers in the future (something like TarContainer.create(from entries: [ContainerEntry]) throws -> Data).
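As a rough illustration, the APIs listed in this section can be combined as follows to build a container with a single file. This is a sketch under assumptions: the helper name `makeSampleTar`, the entry name and the contents are invented for the example, and it assumes that modificationTime is among the TarEntryInfo properties that became settable.

```swift
import Foundation
import SWCompression

/// Hypothetical helper: builds a TAR container with one regular file.
func makeSampleTar() throws -> Data {
    // Describe the entry; the name is just an example.
    var info = TarEntryInfo(name: "hello.txt", type: .regular)
    // Assumption: modificationTime is one of the now-mutable properties.
    info.modificationTime = Date()

    // Attach the file contents to the entry.
    let contents = Data("Hello, TAR!".utf8)
    let entry = TarEntry(info: info, data: contents)

    // Generate the container's Data from the array of entries.
    return try TarContainer.create(from: [entry])
}
```

The returned Data can then be written to disk or passed on, for example to GzipArchive to produce a .tar.gz file.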

Other Changes

  • Improved compatibility with other TAR implementations:
    • All string fields of TAR headers are now treated as UTF-8 strings. Comment: This is compatible with previous behavior since ASCII strings are UTF-8 strings.
    • Non-well-formed numeric fields of TAR headers no longer cause TarError.wrongField to be thrown and instead result in nil values of corresponding properties of TarEntryInfo (exception: size field). Comment: Mainly, this change was made to accommodate situations when a TAR header field is absent (i.e. filled with NULLs). An absent size field is still not accepted since its value impacts the structure of the container. This particular behavior is consistent with other implementations.
    • Base-256 encoding of numeric fields, which is sometimes used for very big or negative values, is now supported.
    • Leading NULLs and whitespace in numeric fields are now correctly skipped.
    • Sun Extended Headers are now processed as local PAX extended headers instead of being considered entries with .unknown type.
    • GNU TAR format features for incremental backups are now partially supported (access and creation time).
  • TarContainer.formatOf now correctly returns TarFormat.gnu when GNU format "magic" field is encountered.
  • A new (copy) Data object is now created for TarEntry.data property instead of using a slice of input container data. Comment: This change makes indices of TarEntry.data zero-based which is consistent with other containers. This should also prevent keeping in memory Data for the entire container until the TarEntry object is destroyed.
  • Fixed incorrect file name of TAR entries from containers with GNU TAR format-specific features being used.
  • Fixed TarError.wrongPaxHeaderEntry error being thrown when header with multi-byte UTF-8 characters is encountered.
  • Fixed incorrect values of TarEntryInfo.ownerID, groupID, deviceMajorNumber and deviceMinorNumber properties (previously, they were assumed to be encoded as decimal numbers).
  • Slightly improved performance of LZMA/LZMA2 operations by making internal classes declared as final.
  • swcomp changes:
    • Added -c, --create option to tar command which creates a new TAR container.
    • Output of benchmark commands is now properly flushed on non-Linux platforms.
    • Results for omitted iterations of benchmark commands are now also printed.
    • Iteration number in benchmark commands is now printed with leading zeros.
    • Fixed compilation error on Linux platforms due to ObjCBool no longer being an alias for Bool.
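To make the TAR numeric-field changes in the compatibility notes above more concrete, here is a standalone sketch of how such a field might be parsed. It is not SWCompression's actual implementation, and the function name `parseTarNumericField` is hypothetical. It assumes the common convention that a set high bit in the first byte marks a base-256 (big-endian, two's complement) value, and that otherwise the field holds octal ASCII digits terminated by a NULL or a space.

```swift
/// Hypothetical helper: parses a numeric TAR header field.
/// Returns nil for absent or malformed fields and for values that do not fit in Int.
func parseTarNumericField(_ field: [UInt8]) -> Int? {
    guard let first = field.first else { return nil }

    if first & 0x80 != 0 {
        // Base-256: bit 0x80 is the marker, bit 0x40 acts as the sign bit.
        let isNegative = first & 0x40 != 0
        var bytes = field
        // Replace the marker bit with a sign-extension bit.
        bytes[0] = isNegative ? first : (first & 0x7F)
        var value = isNegative ? -1 : 0
        for byte in bytes {
            let (shifted, overflow) = value.multipliedReportingOverflow(by: 256)
            if overflow { return nil }
            // The low 8 bits of `shifted` are zero, so OR appends the next byte.
            value = shifted | Int(byte)
        }
        return value
    }

    // Octal ASCII: skip leading NULLs and spaces, then read digits 0-7.
    var value = 0
    var sawDigit = false
    for byte in field.drop(while: { $0 == 0 || $0 == 0x20 }) {
        if byte == 0 || byte == 0x20 { break } // terminator
        guard byte >= 0x30 && byte <= 0x37 else { return nil }
        let (shifted, overflow) = value.multipliedReportingOverflow(by: 8)
        if overflow { return nil }
        value = shifted + Int(byte - 0x30)
        sawDigit = true
    }
    // A field consisting only of NULLs or spaces is treated as absent.
    return sawDigit ? value : nil
}
```

With this sketch, a field filled with NULLs yields nil rather than an error, which mirrors the behavior described above for absent fields.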