CoreXLSX

Excel spreadsheet (XLSX) format parser written in pure Swift

CoreXLSX is a library focused on representing the low-level structure of the XML-based XLSX spreadsheet format. It allows you to open a spreadsheet archive and map its XML structure into model types expressed directly in Swift.

Example

To run the example project, clone the repo, and run pod install from the Example directory first.

Model types in CoreXLSX map the internal structure of the XLSX format directly, with more sensible naming applied to a few attributes. The API is pretty simple:

import CoreXLSX

guard let file = XLSXFile(filepath: "./categories.xlsx") else {
  fatalError("XLSX file corrupted or does not exist")
}

for path in try file.parseWorksheetPaths() {
  let ws = try file.parseWorksheet(at: path)
  for row in ws.data?.rows ?? [] {
    for c in row.cells {
      print(c)
    }
  }
}

This prints every cell from every worksheet in the given XLSX file. Please refer to the Worksheet model for more attributes you might need to read from a parsed file.

Shared strings

Some cells (usually those containing strings) have their values shared in a separate model type, which you can get by evaluating try file.parseSharedStrings(). You can refer to the SharedStrings model for the full list of its properties.

Here's how you can get all shared strings in column "C" for example:

let sharedStrings = try file.parseSharedStrings()
let columnCStrings = ws.cells(atColumns: [ColumnReference("C")!])
  .filter { $0.type == "s" }
  .compactMap { $0.value }
  .compactMap { Int($0) }
  .compactMap { sharedStrings.items[$0].text }
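
The lookup above indexes into the shared strings table directly, so an index that points past the end of the table would trap at runtime. Below is a minimal sketch of a bounds-checked lookup. SketchCell and SketchSharedItem are hypothetical stand-ins that mirror only the type, value and text fields used above; they are not the library's own Cell and SharedStrings types, and sharedText(for:in:) is an illustrative helper, not part of CoreXLSX.

```swift
// Hypothetical stand-ins for the model fields this sketch reads;
// they are NOT the CoreXLSX types themselves.
struct SketchCell {
  let type: String?
  let value: String?
}

struct SketchSharedItem {
  let text: String?
}

/// Returns the shared string a cell points at, or nil when the cell is not a
/// shared-string cell, its value is not an integer, or the index is out of range.
func sharedText(for cell: SketchCell, in items: [SketchSharedItem]) -> String? {
  guard cell.type == "s",
    let raw = cell.value,
    let index = Int(raw),
    items.indices.contains(index)
  else { return nil }
  return items[index].text
}

let items = [SketchSharedItem(text: "Name"), SketchSharedItem(text: "Price")]
let cells = [
  SketchCell(type: "s", value: "1"),   // valid reference into the table
  SketchCell(type: "s", value: "7"),   // index past the end of the table
  SketchCell(type: nil, value: "42"),  // plain numeric cell, not a reference
]

// Cells that cannot be resolved are skipped instead of crashing.
let resolved = cells.compactMap { sharedText(for: $0, in: items) }
print(resolved)  // ["Price"]
```

The same guard conditions could be applied to the real model values inside the compactMap chain shown earlier.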

Styles

Since version 0.5.0 you can parse style information from the archive with the new parseStyles() function. Please refer to the Styles model for more details. Note that not all XLSX files contain style information, so you should be prepared to handle the errors thrown by the parseStyles() function in that case.

Here's a short example that fetches a list of fonts used:

let styles = try file.parseStyles()
let fonts = styles.fonts?.items.compactMap { $0.name?.value }

Reporting compatibility issues

If you stumble upon a file that can't be parsed, please file an issue and post the exact error message. Thanks to the use of the standard Swift Codable protocol, detailed errors are generated that list the missing attribute, so it can easily be added to the model, enabling broader format support. Attaching a file that can't be parsed would also greatly help in diagnosing issues. If these files contain any sensitive data, we suggest obfuscating it or generating fake data with the same tools that generated the original files, assuming the issue can still be reproduced this way.

If the whole file can't be attached, try passing a sufficiently large value (between 10 and 20 usually works well) to the errorContextLength argument of the XLSXFile initializer. This will bundle the failing XML snippet with the debug description of thrown errors. Please also attach the full debug description if possible when reporting issues.
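
As a rough sketch of how that might look in practice, the snippet below passes errorContextLength to the initializer and prints the debug description of any error thrown while parsing. The file path is the placeholder used earlier in this README, and 10 is simply a value from the suggested range; it assumes the argument accepts an integer literal.

```swift
import CoreXLSX

// errorContextLength is the initializer argument described above;
// 10 is an arbitrary pick from the suggested 10...20 range.
guard let file = XLSXFile(filepath: "./categories.xlsx", errorContextLength: 10) else {
  fatalError("XLSX file corrupted or does not exist")
}

do {
  // Parse every worksheet so that any decoding failure surfaces here.
  for path in try file.parseWorksheetPaths() {
    _ = try file.parseWorksheet(at: path)
  }
  print("All worksheets parsed without errors")
} catch {
  // String(reflecting:) produces the debug description of the error,
  // which is the text worth pasting into an issue report.
  print(String(reflecting: error))
}
```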

How does it work?

Since every XLSX file is a zip archive of XML files, CoreXLSX uses the XMLCoder library and the standard Codable protocols to map XML nodes and attributes into plain Swift structs. ZIPFoundation is used for in-memory decompression of zip archives. A detailed description is available here.

Requirements

  • Xcode 10.0 or later
  • Swift 4.2 or later
  • iOS 9.0 / watchOS 2.0 / tvOS 9.0 / macOS 10.11 or later deployment targets

Installation

Swift Package Manager

Swift Package Manager is a tool for managing the distribution of Swift code. It’s integrated with the Swift build system to automate the process of downloading, compiling, and linking dependencies.

Once you have your Swift package set up, adding CoreXLSX as a dependency is as easy as adding it to the dependencies value of your Package.swift.

dependencies: [
  .package(url: "https://github.com/MaxDesiatov/CoreXLSX.git",
           .upToNextMajor(from: "0.7.0"))
]
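
For context, here is a sketch of a complete manifest with that dependency in place. The package and target name MyApp and the tools version are assumptions chosen for illustration; adjust them to match your project.

```swift
// swift-tools-version:5.0
import PackageDescription

let package = Package(
  // "MyApp" is a hypothetical name used only for this example.
  name: "MyApp",
  dependencies: [
    .package(url: "https://github.com/MaxDesiatov/CoreXLSX.git",
             .upToNextMajor(from: "0.7.0")),
  ],
  targets: [
    // The target lists CoreXLSX by name so that `import CoreXLSX` resolves.
    .target(name: "MyApp", dependencies: ["CoreXLSX"]),
  ]
)
```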

CocoaPods

CoreXLSX is available through CocoaPods. To install it, simply add pod 'CoreXLSX', '~> 0.7.0' to your Podfile as shown here:

source 'https://github.com/CocoaPods/Specs.git'
# Uncomment the next line to define a global platform for your project
# platform :ios, '9.0'
use_frameworks!
target '<Your Target Name>' do
  pod 'CoreXLSX', '~> 0.7.0'
end

Carthage

Carthage is a dependency manager that builds your dependencies and provides you with binary frameworks.

Carthage can be installed with Homebrew using the following command:

$ brew update
$ brew install carthage

Inside your Cartfile, add the GitHub path to CoreXLSX and its latest version:

github "MaxDesiatov/CoreXLSX" ~> 0.7.0

Then, run the following command to build the framework:

$ carthage update

Drag the built frameworks (including the subdependencies XMLCoder and ZIPFoundation) into your Xcode project.

Contributing

For development work and for running the tests in Xcode you need to run carthage bootstrap in the root directory of the cloned repository first. Then you can open the CoreXLSX.xcodeproj from the same directory and select the CoreXLSXmacOS scheme. This is the only scheme that has the tests set up, but you can also build any other scheme (e.g. CoreXLSXiOS) to make sure it builds on other platforms.

If you prefer not to work with Xcode, the project fully supports SwiftPM, and the usual workflow with swift build and swift test should work. If it doesn't, please report this as a bug.

Code of Conduct

This project adheres to the Contributor Covenant Code of Conduct. By participating, you are expected to uphold this code. Please report unacceptable behavior to corexlsx@desiatov.com.

Maintainers

Max Desiatov, Matvii Hodovaniuk

License

CoreXLSX is available under the Apache 2.0 license. See the LICENSE file for more info.

Releases

0.7.0 - May 25, 2019

Bugfix release that improves compatibility with different spreadsheet types.

Thanks to @grin for reporting and fixing issues in this release.

Breaking changes

All properties on struct Format except fontId and numberFormatId are now optional.

Additions

New borderId and fillId properties on struct Format.

Fixed bugs

  • Can't get cell string #58
  • Can't load basic spreadsheets created in Google Docs #64
  • fillId and borderId attributes missing from CoreXLSX.Format #65

0.6.1 - May 9, 2019

Bugfix release that adds case externalLink to Relationship.SchemaType, improving .xlsx compatibility.

0.6.0 - May 2, 2019

This is a bugfix release with changes to the model API that improve compatibility with files containing formulas and varied shared strings formats.

Specifically:

  • new struct Formula added with a corresponding property on struct Cell
  • property color on struct Properties became optional
  • properties on struct RichText became optional
  • new chartsheet case added to enum Relationship
  • richText on struct SharedStrings became an array, not optional

Closed issues

  • Error Domain=NSCocoaErrorDomain Code=4865 "Expected String but found null instead." #59
  • Importing XLSX file #56
  • Error ParseCellContent #51
  • error parseWorksheet #50
  • Couldn't find end of Start Tag c #37

0.5.0 - Apr 18, 2019

This is a release with API additions and bug fixes.

This release of CoreXLSX can be integrated as a Swift 5 module if you're using Xcode 10.2, but support for Swift 4.2 and earlier Xcode 10 versions is also maintained.

Compatibility is improved for big files and files that internally contain namespaced XML. A few other previously reported compatibility issues are now fixed. Many thanks to everyone who reported the issues; the improvements in this release wouldn't be possible without your contribution!

Breaking changes

Several properties on the model types became optional when there's no guarantee they are always available in files generated by different apps and tools.

Additions

Now you can parse style information from the archive with the new parseStyles() function. Please refer to the Styles model for more details. Please note that not all XLSX files contain style information, so you should be prepared to handle the errors thrown by the parseStyles() function in that case.

0.4.0 - Feb 7, 2019

This is a release with API improvements and bug fixes. A big thank you to everyone who provided bug reports and contributions that made this release possible!

Breaking changes

  • A few properties on the model types were added with cleaner names and better fitting types. Most of the old versions of those properties were kept as deprecated, but you might get some breakage with optionality, where we couldn't find a good deprecation path.

Additions

  • New parseSharedStrings function on XLSXFile allows you to get the values of cells with a shared string value. Quite frequently those strings are unavailable and are only referenced in the original model types you get with parseWorksheet.

  • Previously when addressing cells and columns you had to use a stringly-typed API. It was also not very convenient for specifying a range of columns. This is now fixed with the new type-safe ColumnReference struct, which conforms to Comparable and Strideable.

  • Following the addition of an error context to XMLCoder, which is the main dependency of CoreXLSX, it is now exposed on struct XLSXFile. Pass a non-zero value to the errorContextLength argument (default is 0) of the XLSXFile initializer and you'll get a snippet of the XML that failed to parse in the debug description of the error value.

  • An additional optional argument, bufferSize, was added to the XLSXFile initializer in response to previous reports about problems with zip file extraction. The default value is 10 MiB, which seems to be enough in most cases, but you can still try passing a larger value for bigger files if you see that an XML file stops abruptly in the middle. Unfortunately, we haven't found a good way to adjust this value dynamically based on the file size, but please let us know if you did.

  • Support for Carthage was added as well as support for tvOS and watchOS.
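
To illustrate why a column type that conforms to Comparable and Strideable makes ranges of columns convenient, here is a self-contained sketch. ColumnLetter is a hypothetical stand-in written for this example and is not the library's ColumnReference; it only demonstrates the idea of mapping spreadsheet column letters to integers so that standard range syntax works.

```swift
/// Hypothetical column type for illustration only (not CoreXLSX's ColumnReference).
struct ColumnLetter: Comparable, Strideable, CustomStringConvertible {
  /// 1-based column number: A = 1, Z = 26, AA = 27.
  let index: Int

  /// Parses uppercase column letters; returns nil for anything else.
  init?(_ letters: String) {
    guard !letters.isEmpty else { return nil }
    let uppercase: ClosedRange<Unicode.Scalar> = "A"..."Z"
    var result = 0
    for scalar in letters.unicodeScalars {
      guard uppercase.contains(scalar) else { return nil }
      result = result * 26 + (Int(scalar.value) - 64)
    }
    index = result
  }

  init(index: Int) {
    precondition(index >= 1, "column numbers start at 1")
    self.index = index
  }

  /// Converts the column number back to letters.
  var description: String {
    var n = index
    var letters = ""
    while n > 0 {
      let remainder = (n - 1) % 26
      letters = String(Character(Unicode.Scalar(UInt8(65 + remainder)))) + letters
      n = (n - 1) / 26
    }
    return letters
  }

  static func < (lhs: ColumnLetter, rhs: ColumnLetter) -> Bool {
    return lhs.index < rhs.index
  }

  // Strideable requirements: with an Int stride, closed ranges become collections.
  func distance(to other: ColumnLetter) -> Int {
    return other.index - index
  }

  func advanced(by n: Int) -> ColumnLetter {
    return ColumnLetter(index: index + n)
  }
}

// A closed range of columns can now be iterated or mapped directly.
let columns = (ColumnLetter("A")!...ColumnLetter("C")!).map { $0.description }
print(columns)  // ["A", "B", "C"]
```

With a stringly-typed API the same range would have to be spelled out letter by letter, and an invalid column such as a lowercase or empty string would only be noticed later.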

Bugfixes

Some files that couldn't be previously parsed should now be handled better thanks to fixes in optionality and more properties added to the model types.
