Swiftpack.co -  ctxppc/PatternKit as Swift Package
Swiftpack.co is a collection of thousands of indexed Swift packages. Search packages.
ctxppc/PatternKit
A Swift library for writing & manipulating patterns for matching on collections.
.package(url: "https://github.com/ctxppc/PatternKit.git", from: "0.3.0")

The PatternKit library

PatternKit is a Swift-based nondeterministic automaton parser and regular expression engine.

  • Fully designed for and written in Swift
  • Supports arbitrary Sequences with Equatable elements (String, [Int], and so on)
  • Batteries included — including full support for matching over Unicode extended grapheme clusters
  • Type-safe — syntax checked at build time by the compiler
  • No parsing/building overhead
  • Value semantics
  • Extensible architecture
  • Backreferences
  • Backtracking
  • Shorthand and longhand notation

PatternKit is still a pre-1.0 project — please use with care. :-)

Usage

(Author's note: Some examples in this document might become outdated as PatternKit evolves.)

The following matches very simple e-mail addresses.

import PatternKit

let alphanumeric = ("A"..."Z") | ("a"..."z") | ("0"..."9")
let part = alphanumeric+ • "." • alphanumeric+

let userToken = Token(part)
let hostToken = Token(part)

let emailAddress = userToken • ("@" as Literal) • hostToken

if let match = emailAddress.matches(over: "johnny.appleseed@example.com").first {
    let userPart = match.captures(for: userToken)[0]
    let hostPart = match.captures(for: hostToken)[0]
    print("User \(userPart) at \(hostPart)")		// Prints "User johnny.appleseed at example.com"
}

Note that the Match.captures(for:) method returns an array of captured strings instead of a single string. Like in other regex engines, a token can be used within a repeating pattern. However, unlike most engines, every new capture by that token is appended to the array instead of discarded.

The following matches the request line of a simple HTTP request.

import PatternKit

let verb = "GET" | "POST" | "PUT" | "HEAD" | "DELETE" | "OPTIONS" | "TRACE"
let version = "0"..."9" • ("." • "0"..."9")+

let pathComponent = "A"..."Z" | "a"..."z" | "0"..."9" | "_" | "+"
let path = ("/" • pathComponent*)+
let pathToken = Token(path)

let requestLine = Anchor.leading • "GET" • CharacterSet.whitespace+ • pathToken • "HTTP/" • version • "\x10\x13"

The following matches subsequences of three or more 6s, followed by two rolls greater than or equal to 3, and followed by three or more 6s again.

let snakes = 6.repeated(min: 3)
let pattern = snakes • (3...6).repeated(exactly: 2) • snakes
let diceRolls = [1, 5, 2, 6, 6, 6, 4, 4, 6, 6, 6, 4, 3, 6, 6, 6, 3, 5, 6, 6, 6, 6]

for match in pattern.matches(in: diceRolls) {
    let diceRollsNumber = match.startIndex + 1
    let foundPattern = match.subsequence.joined(separator: " ,")
    print("Pattern \(foundPattern) found at roll (diceRollsNumber)")
}

// Prints "Pattern 6, 6, 6, 4, 4, 6, 6, 6 found at roll 4" and "Pattern 6, 6, 6, 3, 5, 6, 6, 6, 6 found at roll 14"

The following matches any subsequence of a's, followed by b's, and followed by the same number of a's.

let firstPart = Token("a"+)
let secondPart = "b"+
let thirdPart = Referencing(firstPart)
let pattern = firstPart • secondPart • thirdPart

print(pattern.hasMatches(in: "aaaabbbaaaa"))		// Prints "true"

PatternKit supports all target Sequences with Equatable elements. Non-Comparable elements which are Equatable are also supported, but without support of ranges (... and ..<).

Available patterns & extensibility

PatternKit is designed from the ground up to be extensible with more patterns. Patterns are values that conform to the Pattern protocol. PatternKit provides the following pattern types, and clients can implement more themselves as needed.

  • A literal pattern (Literal) matches an exact sequence. PatternKit includes literal-convertible conformances so that string, character, and integer literals can be used directly within pattern expressions, without having to resort to the Literal initialiser.

  • A repeating pattern (EagerlyRepeating and LazilyRepeating) matches a subpattern consecutively, with an optional minimum and maximum. PatternKit extends Equatable and Pattern with the repeated(min:), repeated(min:max:), and repeated(exactly:) methods as well as the *, +, and ¿ postfix operators which form a repeating pattern on the element, sequence, or subpattern it's called on.

  • An anchor (Anchor) matches a boundary or position, such as a word boundary or the beginning of the sequence. By default, pattern matching begins and ends anywhere in a sequence, so leading and trailing anchors are required if a pattern must match the whole sequence at once.

  • A token (Token) matches what its subpattern matches, but also captures it. The captured subsequences can be retrieved using match.captures(for: token).

  • A referencing pattern (Referencing) matches a subsequence previously captured by a token.

In addition, PatternKit adds conformance to Swift's range types and Foundation's CharacterSet so that they can be used readily in patterns.

GitHub

link
Stars: 9
Last commit: 3 weeks ago

Ad: Job Offers

iOS Software Engineer @ Perry Street Software
Perry Street Software is Jack’d and SCRUFF. We are two of the world’s largest gay, bi, trans and queer social dating apps on iOS and Android. Our brands reach more than 20 million members worldwide so members can connect, meet and express themselves on a platform that prioritizes privacy and security. We invest heavily into SwiftUI and using Swift Packages to modularize the codebase.

Dependencies

Related Packages

Swiftpack is being maintained by Petr Pavlik | @ptrpavlik | @swiftpackco | API