Swiftpack.co - billdonner/LinkGrubber as Swift Package

billdonner/LinkGrubber 0.1.35
Swift package that crawls remote assets and generates CSV, JSON, and Publish assets
⭐️ 1
🕓 4 years ago
.package(url: "https://github.com/billdonner/LinkGrubber.git", from: "0.1.35")
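For context, here is a minimal sketch of a manifest that uses the dependency line above. The package and target name `MyCrawler` is hypothetical, and the sketch assumes the library product is named `LinkGrubber`, which this page does not confirm.

```swift
// swift-tools-version:5.2
import PackageDescription

let package = Package(
    name: "MyCrawler", // hypothetical package name
    dependencies: [
        // Dependency line as shown on this page.
        .package(url: "https://github.com/billdonner/LinkGrubber.git", from: "0.1.35")
    ],
    targets: [
        .target(
            name: "MyCrawler", // hypothetical target name
            // Assumption: the product and package are both named "LinkGrubber".
            dependencies: ["LinkGrubber"]
        )
    ]
)
```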

LinkGrubber

0.1.35: changed several lets to vars and moved them to the outer level to extend their lifetimes

Publish

Crawl Your Remote Assets

In my case, these are MP3 files from band performances over the years.

There are a few sites here that you can crawl: https://billdonner.github.io/LinkGrubber

Generate CSV and JSON for Data Analysis

LinkGrubber writes a CSV file for analysis of your assets in Excel or Numbers, and a JSON version for your own programs.

Callback to Your Own File Maker

Typically, as pages of links are grubbed, you'll want to write a file. What you write is up to you.
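As one illustration of such a file maker, the sketch below renders a list of links as a Markdown page and writes it to disk. It is independent of LinkGrubber: `SketchLink`, `renderMarkdown`, and `writePage` are hypothetical names, and the real callback receives the library's own types rather than these.

```swift
import Foundation

// Hypothetical link record; the real library passes its own types to the callback.
struct SketchLink {
    let title: String
    let url: URL
}

/// Renders a page title and its links as a Markdown document.
func renderMarkdown(title: String, links: [SketchLink]) -> String {
    var lines = ["# \(title)", ""]
    for link in links {
        lines.append("- [\(link.title)](\(link.url.absoluteString))")
    }
    return lines.joined(separator: "\n") + "\n"
}

/// Writes the rendered Markdown to `<directory>/<name>.md` and returns the file URL.
func writePage(named name: String, title: String,
               links: [SketchLink], in directory: URL) throws -> URL {
    let fileURL = directory.appendingPathComponent(name).appendingPathExtension("md")
    try renderMarkdown(title: title, links: links)
        .write(to: fileURL, atomically: true, encoding: .utf8)
    return fileURL
}

// Usage with made-up data, written to the temporary directory.
let sketchLinks = [SketchLink(title: "Set One",
                              url: URL(string: "https://example.com/set1.mp3")!)]
let pageURL = try writePage(named: "gig", title: "Gig", links: sketchLinks,
                            in: FileManager.default.temporaryDirectory)
print(pageURL.path)
```

Keeping the rendering separate from the file write makes the Markdown output easy to check without touching the disk.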

My Static Websites, Built on Publish from John Sundell, Generate Markdown Files

First, declare some functions needed by the grubber

import Foundation
import LinkGrubber

private struct LgFuncs: LgFuncProts {

    // Delegates scraping to the general-purpose extractor.
    func scrapeAndAbsorbFunc(theURL: URL, html: String) throws -> ScrapeAndAbsorbBlock {
        try HTMLExtractor.generalScrapeAndAbsorb(theURL: theURL, html: html)
    }

    // Called with a page's links; write whatever file you need here.
    func pageMakerFunc(_ props: CustomPageProps, _ links: [Fav]) throws {
        // print("MAKING PAGE with props \(props) linkscount: \(links.count)")
    }

    // Accepts only URLs that start with the given prefix.
    func matchingFunc(_ u: URL) -> Bool {
        u.absoluteString.hasPrefix("https://billdonner.")
    }

    // Treats these file extensions as images.
    func isImageExtensionFunc(_ s: String) -> Bool {
        ["jpg", "jpeg", "png"].contains(s)
    }
}
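The two predicates above are plain string tests, so they can be tried on their own. The following is a minimal sketch that does not depend on LinkGrubber; `isInScope` and `isImageExtension` are hypothetical stand-ins, and the case-insensitive comparison is an addition that the original does not make.

```swift
import Foundation

// Hypothetical stand-ins for matchingFunc and isImageExtensionFunc.

/// Returns true when the URL starts with the prefix of the site being crawled.
func isInScope(_ url: URL, prefix: String = "https://billdonner.") -> Bool {
    url.absoluteString.hasPrefix(prefix)
}

/// Returns true for common image extensions, ignoring case (an assumption of this sketch).
func isImageExtension(_ ext: String) -> Bool {
    ["jpg", "jpeg", "png"].contains(ext.lowercased())
}

let crawlable = URL(string: "https://billdonner.github.io/LinkGrubber")!
let elsewhere = URL(string: "https://example.com/index.html")!

print(isInScope(crawlable))       // true
print(isInScope(elsewhere))       // false
print(isImageExtension("PNG"))    // true
print(isImageExtension("mp3"))    // false
```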

Then make a LinkGrubber and grub

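do {
    // lgFuncs is an instance of the LgFuncs struct above;
    // rootstart is the starting point of the crawl, defined elsewhere.
    let _ = try LinkGrubber()
        .grub(roots: [rootstart],
              opath: "/x/y/z",
              logLevel: .verbose,
              lgFuncs: lgFuncs) { crawlerstats in
            // Keep the crawl statistics for later use.
            self.grubstats = crawlerstats
        }
} catch {
    print("couldn't grub \(error)")
}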

Now use the .csv file in Numbers or Excel

When the crawl finishes, there's a CSV file waiting for you.

Use the .json file for further development endeavors
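This page does not describe the layout of the JSON file, so a first step is to load it without assuming a schema and look at its top-level shape. A minimal sketch using Foundation's `JSONSerialization`; `loadJSON` and `describe` are hypothetical helper names, and the sample file stands in for the real output.

```swift
import Foundation

/// Loads a JSON file and returns its top-level value without assuming a schema.
func loadJSON(at url: URL) throws -> Any {
    let data = try Data(contentsOf: url)
    return try JSONSerialization.jsonObject(with: data, options: [])
}

/// Describes the top-level shape of a decoded JSON value.
func describe(_ json: Any) -> String {
    switch json {
    case let array as [Any]:
        return "array with \(array.count) elements"
    case let dict as [String: Any]:
        return "object with keys: \(dict.keys.sorted().joined(separator: ", "))"
    default:
        return "scalar value"
    }
}

// Demo with a stand-in file; point the URL at the .json file LinkGrubber wrote instead.
let sampleURL = FileManager.default.temporaryDirectory
    .appendingPathComponent("sample.json")
try #"[{"name": "a"}, {"name": "b"}]"#
    .write(to: sampleURL, atomically: true, encoding: .utf8)
print(describe(try loadJSON(at: sampleURL)))
```

Once the shape is known, a `Codable` struct that matches the actual keys is usually the more convenient way to read the file.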
