HTMLEntities

Summary

A pure Swift utility for encoding and decoding HTML entities.

Includes support for HTML5 named character references; the HTML5 specification lists all 2231 of them.

HTMLEntities can escape ALL non-ASCII characters as well as the characters <, >, &, ", and ', as these five characters are part of the HTML tag and HTML attribute syntaxes.

In addition, HTMLEntities can unescape encoded HTML text that contains decimal, hexadecimal, or HTML5 named character references.

API Documentation

API documentation for HTMLEntities is located here.

Features

  • Supports HTML5 named character references (NegativeMediumSpace; etc.)
  • HTML5 spec-compliant; strict parse mode recognizes parse errors
  • Supports decimal and hexadecimal escapes for all characters
  • Simple to use: the functions are added as extensions on the standard String type
  • Minimal dependencies; implementation is completely self-contained

Version Info

The latest release of HTMLEntities requires Swift 4.0 or higher.

Installation

Via Swift Package Manager

Add HTMLEntities to your Package.swift:

import PackageDescription

let package = Package(
  name: "<package-name>",
  ...
  dependencies: [
    .package(url: "https://github.com/Kitura/swift-html-entities.git", from: "3.0.0")
  ]
  // Also, make sure to add HTMLEntities to your package target's dependencies
)
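
For reference, a complete manifest might look like the sketch below. The package and target name MyApp are placeholders, and the tools version is an assumption; the product name HTMLEntities is the module imported in the usage examples. Newer tools versions may require the explicit .product(name:package:) form in the target's dependencies.

```swift
// swift-tools-version:5.0
import PackageDescription

let package = Package(
    name: "MyApp", // placeholder package name
    dependencies: [
        .package(url: "https://github.com/Kitura/swift-html-entities.git", from: "3.0.0")
    ],
    targets: [
        .target(
            name: "MyApp", // placeholder target name
            // Listing the product here makes `import HTMLEntities` available in this target
            dependencies: ["HTMLEntities"]
        )
    ]
)
```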

Via CocoaPods

Add HTMLEntities to your Podfile:

target '<project-name>' do
  pod 'HTMLEntities', :git => 'https://github.com/Kitura/swift-html-entities.git'
end

Via Carthage

Add HTMLEntities to your Cartfile:

github "Kitura/swift-html-entities"

Usage

import HTMLEntities

// encode example
let html = "<script>alert(\"abc\")</script>"

print(html.htmlEscape())
// Prints "&#x3C;script&#x3E;alert(&#x22;abc&#x22;)&#x3C;/script&#x3E;"

// decode example
let htmlencoded = "&lt;script&gt;alert(&quot;abc&quot;)&lt;/script&gt;"

print(htmlencoded.htmlUnescape())
// Prints "<script>alert(\"abc\")</script>"

Advanced Options

HTMLEntities supports various options when escaping and unescaping HTML characters.

Escape Options

allowUnsafeSymbols

Defaults to false. Specifies whether unsafe ASCII characters (those that are part of HTML tag and attribute syntax) are left unescaped. Non-ASCII characters are still escaped.

import HTMLEntities

let html = "<p>\"café\"</p>"

print(html.htmlEscape())
// Prints "&#x3C;p&#x3E;&#x22;caf&#xE9;&#x22;&#x3C;/p&#x3E;"

print(html.htmlEscape(allowUnsafeSymbols: true))
// Prints "<p>\"caf&#xE9;\"</p>"
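
To make the rule concrete, here is a minimal standalone sketch of the selection logic, assuming the unsafe set is the five characters <, >, &, ", and '. The function demoEscape is hypothetical and is not the library's implementation; it ignores named references and the library's handling of non-printing ASCII characters.

```swift
// Hypothetical helper for illustration; not the library's implementation.
// Non-ASCII scalars are always escaped; the five unsafe ASCII symbols are
// escaped unless `allowUnsafeSymbols` is true. All escapes are hexadecimal.
func demoEscape(_ input: String, allowUnsafeSymbols: Bool = false) -> String {
    let unsafeSymbols: Set<Unicode.Scalar> = ["<", ">", "&", "\"", "'"]
    var output = ""
    for scalar in input.unicodeScalars {
        let isUnsafe = unsafeSymbols.contains(scalar)
        if !scalar.isASCII || (isUnsafe && !allowUnsafeSymbols) {
            output += "&#x" + String(scalar.value, radix: 16, uppercase: true) + ";"
        } else {
            output.unicodeScalars.append(scalar)
        }
    }
    return output
}

let sample = "<p>\"caf\u{E9}\"</p>"
print(demoEscape(sample))
// &#x3C;p&#x3E;&#x22;caf&#xE9;&#x22;&#x3C;/p&#x3E;
print(demoEscape(sample, allowUnsafeSymbols: true))
// <p>"caf&#xE9;"</p>
```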

decimal

Defaults to false. Specifies if decimal character escapes should be used instead of hexadecimal character escapes whenever a numeric character escape is used (named character reference escapes are not affected). The use of hexadecimal character escapes is recommended.

import HTMLEntities

let text = "한, 한, ế, ế, 🇺🇸"
// The first and fourth items are written in decomposed form (several Unicode
// scalars each), so each of them produces more than one character reference.

print(text.htmlEscape())
// Prints "&#x1112;&#x1161;&#x11AB;, &#xD55C;, &#x1EBF;, e&#x302;&#x301;, &#x1F1FA;&#x1F1F8;"

print(text.htmlEscape(decimal: true))
// Prints "&#4370;&#4449;&#4523;, &#54620;, &#7871;, e&#770;&#769;, &#127482;&#127480;"
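
The two notations are just the same Unicode scalar value written in base 16 or base 10. The sketch below shows the correspondence with a hypothetical helper, demoNumericReference, which is not part of HTMLEntities.

```swift
// Hypothetical helper for illustration; not part of HTMLEntities.
// Formats one Unicode scalar as a hexadecimal or decimal character reference.
func demoNumericReference(for scalar: Unicode.Scalar, decimal: Bool = false) -> String {
    if decimal {
        return "&#\(scalar.value);"
    }
    return "&#x" + String(scalar.value, radix: 16, uppercase: true) + ";"
}

// U+D55C is the precomposed Hangul syllable from the example above.
let syllable: Unicode.Scalar = "\u{D55C}"
print(demoNumericReference(for: syllable))                // &#xD55C;
print(demoNumericReference(for: syllable, decimal: true)) // &#54620;

// A decomposed string yields one reference per scalar.
let combiningMarks = "e\u{302}\u{301}".unicodeScalars.dropFirst()
print(combiningMarks.map { demoNumericReference(for: $0) }.joined())
// &#x302;&#x301;
```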

encodeEverything

Defaults to false. Specifies if all characters should be escaped, even if some characters are safe. If true, overrides the setting for allowUnsafeSymbols.

import HTMLEntities

let text = "A quick brown fox jumps over the lazy dog"

print(text.htmlEscape())
// Prints "A quick brown fox jumps over the lazy dog"

print(text.htmlEscape(encodeEverything: true))
// Prints "&#x41;&#x20;&#x71;&#x75;&#x69;&#x63;&#x6B;&#x20;&#x62;&#x72;&#x6F;&#x77;&#x6E;&#x20;&#x66;&#x6F;&#x78;&#x20;&#x6A;&#x75;&#x6D;&#x70;&#x73;&#x20;&#x6F;&#x76;&#x65;&#x72;&#x20;&#x74;&#x68;&#x65;&#x20;&#x6C;&#x61;&#x7A;&#x79;&#x20;&#x64;&#x6F;&#x67;"

// `encodeEverything` overrides `allowUnsafeSymbols`
print(text.htmlEscape(allowUnsafeSymbols: true, encodeEverything: true))
// Prints "&#x41;&#x20;&#x71;&#x75;&#x69;&#x63;&#x6B;&#x20;&#x62;&#x72;&#x6F;&#x77;&#x6E;&#x20;&#x66;&#x6F;&#x78;&#x20;&#x6A;&#x75;&#x6D;&#x70;&#x73;&#x20;&#x6F;&#x76;&#x65;&#x72;&#x20;&#x74;&#x68;&#x65;&#x20;&#x6C;&#x61;&#x7A;&#x79;&#x20;&#x64;&#x6F;&#x67;"

useNamedReferences

Defaults to false. Specifies if named character references should be used whenever possible. Set to false to always use numeric character references, e.g., for compatibility with older browsers that do not recognize named character references.

import HTMLEntities

let html = "<script>alert(\"abc\")</script>"

print(html.htmlEscape())
// Prints "&#x3C;script&#x3E;alert(&#x22;abc&#x22;)&#x3C;/script&#x3E;"

print(html.htmlEscape(useNamedReferences: true))
// Prints "&lt;script&gt;alert(&quot;abc&quot;)&lt;/script&gt;"

Set Escape Options Globally

HTML escape options can be set globally so that you don't have to set them every time you want to escape a string. The options are managed in the String.HTMLEscapeOptions struct.

import HTMLEntities

// set `useNamedReferences` to `true` globally
String.HTMLEscapeOptions.useNamedReferences = true

let html = "<script>alert(\"abc\")</script>"

// Now, the default behavior of `htmlEscape()` is to use named character references
print(html.htmlEscape())
// Prints "&lt;script&gt;alert(&quot;abc&quot;)&lt;/script&gt;"

// And you can still go back to using numeric character references only
print(html.htmlEscape(useNamedReferences: false))
// Prints "&#x3C;script&#x3E;alert(&#x22;abc&#x22;)&#x3C;/script&#x3E;"
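
One way to get this kind of behavior in Swift is to use a static property as the default value of a parameter: default argument expressions are evaluated on each call, so changing the static property changes later calls that omit the argument. The sketch below uses hypothetical names (DemoEscapeOptions, demoEscapeLessThan) and is not taken from the library's source.

```swift
// Hypothetical names for illustration; not the library's source.
struct DemoEscapeOptions {
    // Global default, read each time the argument is omitted.
    static var useNamedReferences = false
}

// Replaces every "<" with a named or numeric character reference.
func demoEscapeLessThan(in text: String,
                        useNamedReferences: Bool = DemoEscapeOptions.useNamedReferences) -> String {
    let reference = useNamedReferences ? "&lt;" : "&#x3C;"
    return text.split(separator: "<", omittingEmptySubsequences: false)
        .joined(separator: reference)
}

print(demoEscapeLessThan(in: "a<b"))                            // a&#x3C;b
DemoEscapeOptions.useNamedReferences = true
print(demoEscapeLessThan(in: "a<b"))                            // a&lt;b
print(demoEscapeLessThan(in: "a<b", useNamedReferences: false)) // a&#x3C;b
```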

Unescape Options

strict

Defaults to false. Specifies if HTML5 parse errors should be thrown or simply passed over.

Note: htmlUnescape() is a throwing function whenever the strict argument is passed, whether it is set to true or false, so the call must be marked with try. Called with no arguments, htmlUnescape() does NOT throw.

import HTMLEntities

let text = "&#4370&#4449&#4523"

print(text.htmlUnescape())
// Prints "한"

print(try text.htmlUnescape(strict: true))
// Throws a `ParseError.MissingSemicolon` instance

// a throwing function because `strict` is passed in argument
// but no error is thrown because `strict: false`
print(try text.htmlUnescape(strict: false))
// Prints "한"
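
The behavior described in the note is what a pair of overloads would produce: one non-throwing method with no parameters and one throwing method that takes strict. The sketch below illustrates that pattern with a deliberately simplified decoder that handles only decimal references. All names (DemoParseError, demoUnescape) are hypothetical, and this is an assumption about the design, not the library's source.

```swift
// Hypothetical names throughout; this is not the library's source.
enum DemoParseError: Error {
    case missingSemicolon
}

extension String {
    // Throwing overload: decodes decimal references such as "&#54620;".
    // In strict mode a reference without a trailing ";" is an error;
    // otherwise it is decoded anyway.
    func demoUnescape(strict: Bool) throws -> String {
        let scalars = Array(unicodeScalars)
        let digits: ClosedRange<UInt32> = 48...57 // ASCII "0" through "9"
        var result = String.UnicodeScalarView()
        var i = 0
        while i < scalars.count {
            var j = i + 2
            var value: UInt32 = 0
            // Match "&#" followed by up to seven decimal digits.
            if scalars[i] == "&", i + 1 < scalars.count, scalars[i + 1] == "#" {
                while j < scalars.count, j - i - 2 < 7, digits.contains(scalars[j].value) {
                    value = value * 10 + (scalars[j].value - 48)
                    j += 1
                }
            }
            if j > i + 2, let decoded = Unicode.Scalar(value) {
                if j < scalars.count, scalars[j] == ";" {
                    j += 1 // consume the terminating semicolon
                } else if strict {
                    throw DemoParseError.missingSemicolon
                }
                result.append(decoded)
                i = j
            } else {
                result.append(scalars[i]) // not a reference; copy as-is
                i += 1
            }
        }
        return String(result)
    }

    // Non-throwing overload: lenient mode never throws in this sketch,
    // so `try!` cannot fail here.
    func demoUnescape() -> String {
        return try! demoUnescape(strict: false)
    }
}

let encoded = "&#4370&#4449&#4523"
print(encoded.demoUnescape()) // no `try` needed
do {
    print(try encoded.demoUnescape(strict: false)) // `try` required, but nothing is thrown
    print(try encoded.demoUnescape(strict: true))  // throws DemoParseError.missingSemicolon
} catch {
    print("parse error: \(error)")
}
```

Keeping the no-argument overload non-throwing means the common case needs no try, while callers who opt into strict checking get errors through Swift's normal error handling.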

Acknowledgments

HTMLEntities was designed to support some of the same options as he, a popular JavaScript HTML encoder/decoder.

License

Apache 2.0

Releases

3.0.14 - 2019-10-27 20:19:08

  • No longer depends on Foundation when built with Swift 4.2 or later (#58). Credit: @broadwaylamb

3.0.13 - 2019-04-29 19:24:43

  • Improve compile time in release mode on Swift 5 (#51)

3.0.12 - 2019-04-11 09:49:59

  • Build in Swift 5 mode (#54)

3.0.11 - 2019-03-26 11:50:40

  • Adds Swift 5 compatibility (#46, #47, #49). Note that Swift 3.1 is no longer supported.

3.0.10 - 2018-03-29 17:23:26

What's New

  • Remove uses of the deprecated String.characters property

3.0.9 - 2017-09-14 16:03:47

What's New

  • Add CFBundleVersion to xcodeproj so apps can be submitted to iTunes Connect/TestFlight; #37

3.0.8 - 2017-09-13 18:29:31

What's New

  • Add support for Swift 4

3.0.7 - 2017-09-08 17:12:16

What is new

  • Set deployment target OS versions to the minimum allowed (for Carthage builds)

3.0.6 - 2017-09-06 14:38:56

What is new

  • Add xcodeproj so that running carthage update will build the code into frameworks

3.0.5 - 2017-08-29 21:35:58

What is new

  • Fix minor bug: #29

3.0.4 - 2017-07-31 16:15:36

What is new

  • Added support for CocoaPods. See README for Podfile instructions.

3.0.3 - 2017-04-24 19:23:52

What is new

  • Added Swift 3.1.1 support
  • Added new test case to enforce equal number of tests on OSX and Linux; randomize test order

3.0.2 - 2017-03-28 19:17:26

What is new

  • Added Swift 3.1 support
  • Added new Travis matrix build to test for backwards compatibility with Swift 3.0.2

3.0.1 - 2016-12-19 16:13:13

What is new

  • Added Swift 3.0.2 support
  • Added docs generated via Jazzy

3.0.0 - 2016-12-07 19:51:47

Major update adding support for more HTML escape options

What's new

  • Support more HTML escape options (allowUnsafeSymbols, encodeEverything)
  • Support global HTML escape option overrides
  • Change default value of useNamedReferences parameter from true to false
  • Ignore non-printing ASCII characters (DEL, TAB, etc.) when escaping

This update is tagged as 3.0.0 because it contains breaking changes since 2.0.1. Make sure to test your code again after updating to 3.0.0.

2.0.1 - 2016-11-09 17:45:48

What is new

  • Supports Swift 3.0.1; backwards compatible with Swift 3.0

2.0.0 - 2016-10-17 22:00:46

Major update adding HTML5 support

What is new

  • Supports HTML5 named character references (NegativeMediumSpace; etc.)
  • HTML5 spec-compliant; strict parse mode recognizes parse errors
  • Decode HTML in non-strict mode by default (previously it was strict mode by default)

This update is tagged as 2.0.0 because it contains breaking changes since 1.0.2. Make sure to test your code again after updating to 2.0.0.

1.0.2 - 2016-09-27 20:41:59

Bug fixes

  • Fixed bug when parsing special numeric characters (e.g., &#x80; should be decoded as 0x20AC, aka the Euro sign); full table of special numeric characters here
  • Fixed bug when parsing invalid Unicode ranges (0xD800 to 0xDFFF inclusive, or greater than 0x10FFFF); these numeric character references are now all decoded as the 0xFFFD replacement character

1.0.1 - 2016-09-23 15:44:28

Bug fixes

  • Fixed bug when unescaping named character references that contain numbers (e.g., &frac34;)
  • Fixed bug on Linux when unescaping strings that only contain empty-string-equivalent characters (e.g., "\u{200C}" == "" is true on Linux but false on macOS)

1.0.0 - 2016-09-22 16:49:08

HTMLEntities 1.0

Welcome to the first release of HTMLEntities, a pure Swift tool for HTML escaping/unescaping.

Features

  • Supports HTML4 named character references (nbsp, cent, etc.)
  • Supports decimal and hexadecimal escapes for non-named characters
  • Simple to use as functions are added by way of extending the default String class
  • Minimal dependencies; implementation is completely self-contained

Swift Version

HTMLEntities 1.0 runs on Swift 3.0, on both macOS and Ubuntu Linux.