Swiftpack.co - musesum/MuPar as Swift Package

musesum/MuPar 0.23.0201
micro parser
⭐️ 0
🕓 7 weeks ago
.package(url: "https://github.com/musesum/MuPar.git", from: "0.23.0201")

MuPar

MuPar is a simple parse graph for DSLs and NLP:

  • DSLs (domain-specific languages), like MuFlo
  • NLP (chat bots) with flexible word position and closures

It has the following features:

  • modified Backus Naur Form (BNF) to define a named parse tree
  • optional namespace { } brackets to restrict a sub-parse
  • allow runtime recompile of syntax for NLP / chat bots
  • somewhat idiomatic to Swift syntax

Graph-based intermediate representation

  • breaks graph loops when resolving namespaces
  • allows future integration with data flow graphs
  • allows future bottom-up restructuring of the parse tree

Runtime closures to extend the lexicon

  • search the current Calendar, flight schedules, etc.
  • integrate procedural code

Imprecise searching

  • allows different word orders
  • based on minimal hops (Hamming distance) from the graph

Short term memory (STM)

  • keeps keywords from previous queries to complete imprecise matches
  • may be set to 0 seconds for computer language parsing

Modified BNF

Here is the ubiquitous Hello World:

greeting ≈ "hello" "world"

Namespace { } brackets limit the symbols hello and world to greeting:

greeting ≈ hello world {
     hello ≈ "hello"
     world ≈ "world"
}

Double quotes match strings, while single quotes match regular expressions:

year ≈ '(19|20)[0-9][0-9]'
digits ≈ '[0-9]{1,5}'
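
To get a feel for what patterns like these accept, they can be tried against plain strings. The sketch below uses Foundation's NSRegularExpression and a hypothetical helper named leadingMatch; whether MuPar evaluates single-quoted patterns in exactly this way is an assumption, not something stated here.

```swift
import Foundation

// Hypothetical helper (not part of MuPar): returns the text matched at the
// start of `text`, or nil when the pattern does not match there.
func leadingMatch(_ pattern: String, in text: String) -> String? {
    // Anchor the pattern at the start of the input.
    guard let regex = try? NSRegularExpression(pattern: "^(?:" + pattern + ")") else {
        return nil
    }
    let whole = NSRange(text.startIndex..., in: text)
    guard let found = regex.firstMatch(in: text, options: [], range: whole),
          let range = Range(found.range, in: text) else {
        return nil
    }
    return String(text[range])
}

let year = "(19|20)[0-9][0-9]"
let digits = "[0-9]{1,5}"

print(leadingMatch(year, in: "1999 was a year") ?? "no match")   // 1999
print(leadingMatch(year, in: "1899 was a year") ?? "no match")   // no match
print(leadingMatch(digits, in: "1234567") ?? "no match")         // 12345
```

The year pattern only accepts the 1900s and 2000s, and the digits pattern stops after at most five digits.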

Alternation and repetition are supported:

greetings ≈ cough{,3} (hello | yo+) (big | beautiful)* world?

Closures for Runtime APIs

The file test.par contains the line

events ≈ 'event' eventList()

The source in TestNLP+Test.swift then attaches a callback to eventList()

root?.setMatch("test show event eventList()", eventListChecker)

where eventListChecker is a simple callback that extends the lexicon:

func eventListChecker(_ str: Substring) -> String? {
     // accept input that begins with "yo"; anything else is not in the lexicon
     return str.hasPrefix("yo") ? "yo" : nil
}

In the real world, the callback could query a dynamic calendar or any other third-party API.
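
The idea of looking up a closure by name at match time can be sketched in isolation. The ClosureLexicon type below is hypothetical and is not MuPar's API; it only assumes the callback signature shown above, (Substring) -> String?, and shows how an unknown word becomes matchable once a closure is attached.

```swift
// The callback type used by eventListChecker.
typealias MatchClosure = (Substring) -> String?

// Hypothetical registry (not MuPar's API): maps a name such as "eventList"
// to the closure that decides whether the input is in the lexicon.
struct ClosureLexicon {
    private var closures: [String: MatchClosure] = [:]

    mutating func attach(_ name: String, _ closure: @escaping MatchClosure) {
        closures[name] = closure
    }

    // Returns the matched word, or nil when no closure is attached
    // or the attached closure rejects the input.
    func match(_ name: String, _ input: Substring) -> String? {
        return closures[name]?(input)
    }
}

func eventListChecker(_ str: Substring) -> String? {
    return str.hasPrefix("yo") ? "yo" : nil
}

var lexicon = ClosureLexicon()
let input: Substring = "yo"
print(lexicon.match("eventList", input) ?? "unknown")   // before attaching: unknown
lexicon.attach("eventList", eventListChecker)
print(lexicon.match("eventList", input) ?? "unknown")   // after attaching: yo
```

Because the closure is consulted only when a match is attempted, it can be attached or replaced at runtime without recompiling the grammar.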

Here is the output from ParTests/TestNLP+Test.swift :

⟹ before attaching eventListChecker() - `yo` is unknown
"test show event yo" ⟹ 🚫 failed

⟹ runtime is attaching eventListChecker() callback to eventList()
"test show event eventList()"  ⟹  eventList.924 = (Function)

⟹ now `yo` is now matched during runtime
"test show event yo" ⟹  test: 0 show: 0 event: 0 yo: 0 ⟹ hops: 0 ✔︎

Imprecise matching

For NLP, word order may not perfectly match the parse tree order, so the parser reports the number of hops (or Hamming distance) from the ideal match.
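
As a small illustration of how the hop report reads: each result line in this section's test output lists a hop count per word followed by a total, and in every line the total equals the sum of the per-word counts. The helper totalHops below is hypothetical and only re-adds those counts; how MuPar derives each per-word count is not covered by this sketch.

```swift
// Hypothetical helper (not part of MuPar): sums the per-word hop counts
// in a report fragment such as "test: 0 show: 1 event: 0 yo: 1".
func totalHops(_ report: String) -> Int {
    return report
        .split(separator: " ")      // ["test:", "0", "show:", "1", ...]
        .compactMap { Int($0) }     // keep only the numeric fields
        .reduce(0, +)
}

print(totalHops("test: 0 show: 1 event: 0 yo: 1"))   // 2
print(totalHops("test: 1 show: 1 event: 2 yo: 2"))   // 6
```

A total of 0 means the query matched the parse tree order exactly; larger totals mean a looser match.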

Output from ParTests/TestNLP+Test.swift:

"test event show yo" ⟹  test: 0 show: 1 event: 0 yo: 1 ⟹ hops: 2 ✔︎
"yo test show event" ⟹  test: 1 show: 1 event: 2 yo: 2 ⟹ hops: 6 ✔︎
"test show yo event" ⟹  test: 0 show: 0 event: 1 yo: 0 ⟹ hops: 1 ✔︎
"test event yo show" ⟹  test: 0 show: 2 event: 0 yo: 0 ⟹ hops: 2 ✔︎

Short term memory

For NLP, set a time window during which words from a previous query carry over to the next query.
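
The time window can be pictured as a list of timestamped keywords that expire. The RecentWords type below is a hypothetical sketch, not MuPar's ParRecents; timestamps are passed in explicitly (in seconds) so the behavior is easy to follow, and the strict comparison is an assumption chosen so that a window of 0 seconds carries nothing over.

```swift
// Hypothetical sketch (not MuPar's ParRecents): remembers keywords with the
// time they were seen and forgets them after `shortTermMemory` seconds.
struct RecentWords {
    var shortTermMemory: Double   // seconds; 0 disables carry-over
    private var seen: [(word: String, time: Double)] = []

    init(shortTermMemory: Double) {
        self.shortTermMemory = shortTermMemory
    }

    mutating func remember(_ words: [String], at time: Double) {
        seen.append(contentsOf: words.map { (word: $0, time: time) })
    }

    // Words still inside the memory window at `time`, oldest first.
    func recalled(at time: Double) -> [String] {
        return seen
            .filter { time - $0.time < shortTermMemory }
            .map { $0.word }
    }
}

var recents = RecentWords(shortTermMemory: 8)
recents.remember(["test", "show", "event", "yo"], at: 0)
print(recents.recalled(at: 3))   // all four words are still available
print(recents.recalled(at: 9))   // the window has passed, nothing is recalled
```

A later partial query could then be completed from the recalled words, which is the behavior the test output in this section demonstrates.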

Output from ParTests/TestNLP+Test.swift:

⟹ with no shortTermMemory, partial matches fail
"test show event yo" ⟹  test: 0 show: 0 event: 0 yo: 0 ⟹ hops: 0 ✔︎
"test hide yo" ⟹ 🚫 failed
"test hide event" ⟹ 🚫 failed
"hide event" ⟹ 🚫 failed
"hide" ⟹ 🚫 failed

⟹ after setting ParRecents.shortTermMemory = 8 seconds
"test show event yo" ⟹  test: 0 show: 0 event: 0 yo: 0 ⟹ hops: 0 ✔︎
"test hide yo" ⟹  test: 0 show: 10 event: 10 yo: 0 ⟹ hops: 20 ✔︎
"test hide event" ⟹  test: 0 show: 10 event: 1 yo: 9 ⟹ hops: 20 ✔︎
"hide event" ⟹  test: 10 show: 9 event: 0 yo: 8 ⟹ hops: 27 ✔︎
"hide" ⟹  test: 9 show: 8 event: 8 yo: 9 ⟹ hops: 34 ✔︎

Use Case

Here is the Par definition in the Par format:

par ≈ name "≈" right+ sub? end_ {
    name ≈ '^[A-Za-z_]\w*'
    right ≈ or_ | and_ | parens {
        or_ ≈ and_ orAnd+ {
            orAnd ≈ "|" and_
        }
        and_ ≈ leaf reps? {
            leaf ≈ match | path | quote | regex {
                match ≈ '^([A-Za-z_]\w*)\(\)'
                path ≈ '^[A-Za-z_][A-Za-z0-9_.]*'
                quote ≈ '^\"([^\"]*)\"' // skip  \"
                regex ≈ '^([i_]*\'[^\']+)'
            }
        }
        parens ≈ "(" right ")" reps
    }
    sub ≈ "{" end_ par "}" end_?
    end_ ≈ '[ \\n\\t,]*'
    reps ≈ '^([\~]?([\?\+\*]|\{[,]?\d+[,]?\d*\})[\~]?)'
}

Here is a complete Par definition for the functional data flow graph, called Flo:

flo ≈ left right* {

    left ≈ (path | name)
    right ≈ (hash | time | value | child | many | copyat | array | edges | embed | comment)+

    hash ≈ "#" num
    time ≈ "~" num
    child ≈ "{" comment* flo+ "}" | "." flo+
    many ≈ "." "{" flo+ "}"
    array ≈ "[" thru "]"
    copyat ≈ "@" (path | name) ("," (path | name))*

    value ≈ scalar | exprs
    value1 ≈ scalar1 | exprs

    scalar ≈ "(" scalar1 ")"
    scalars ≈ "(" scalar1 ("," scalar1)* ")"
    scalar1 ≈ (thru | modu | data | num) {
        thru ≈ num ("..." | "…") num dflt? now?
        modu ≈ "%" num dflt? now?
        index ≈ "[" (name | num) "]"
        data ≈ "*"
        dflt ≈ "=" num
        now ≈ ":" num
    }
    exprs ≈ "(" expr+ ("," expr+)* ")" {
        expr ≈ (exprOp | name | scalars | scalar1 | quote)
        exprOp ≈ '^(<=|>=|==|<|>|\*|_\/|\/|\%|\:|in|\,)|(\+)|(\-)[ ]'
    }
    edges ≈ edgeOp (edgePar | exprs | edgeItem) comment* {

        edgeOp ≈ '^([<←][<[email protected]⟐⟡◇→>]+|[[email protected]⟐⟡◇→>]+[>→])'
        edgePar ≈ "(" edgeItem+ ")" edges?
        edgeItem ≈ (edgeVal | ternary) comment*
        edgeVal ≈ (path | name) (edges+ | value)?

        ternary ≈ "(" tern ")" | tern {
            tern ≈ ternIf ternThen ternElse? ternRadio?
            ternIf ≈ (path | name) ternCompare?
            ternThen ≈ "?" (ternary | path | name | value1)
            ternElse ≈ ":" (ternary | path | name | value1)
            ternCompare ≈ compare (path | name | value1)
            ternRadio ≈ "|" ternary
        }
    }
    path ≈ '^(([A-Za-z_][A-Za-z0-9_]*)?[.º˚*]+[A-Za-z0-9_.º˚*]*)'
    name ≈ '^([A-Za-z_][A-Za-z0-9_]*)'
    quote ≈ '^\"([^\"]*)\"'
    num ≈ '^([+-]*([0-9]+[.][0-9]+|[.][0-9]+|[0-9]+[.](?![.])|[0-9]+)([e][+-][0-9]+)?)'
    comment ≈ '^([,]+|^[/]{2,}[ ]*(.*?)[\n\r\t]+|\/[*]+.*?\*\/)'
    compare ≈ '^[<>!=][=]?'
    embed ≈ '^[{][{](?s)(.*?)[}][}]'
}

Future

Par is vertically integrated with Flo

  • A future version of Flo may embed Par as a node value type

Bottom-up restructuring of the parse from user queries

  • The parse tree may be discarded in favor of a parse graph
  • The graph is built from user queries, with prevNode and nextNode edges acting as n-grams
  • Queries are matched by assembling nodes middle-out from a dictionary of words and n-gram edges
