Deliberately added generated documentation to the repo
To see if I can make documentation pages work on github.
parent cff4492c03
commit e066c033be
16 changed files with 940 additions and 9 deletions
@@ -1,4 +1,6 @@
 (ns html-to-md.blogger-to-md
+"Convert blogger posts to Markdown format, omitting all the Blogger chrome
+and navigation."
 (:require [clojure.string :as s]
 [html-to-md.html-to-md :refer [markdown-dispatcher markdown-header]]
 [html-to-md.transformer :refer [process]]
@@ -34,7 +36,7 @@
 
 
 (def blogger-dispatcher
-"Adaptation of `markdown-dispatcher`, q.v., with the `:table`, `:h3` and
+"Adaptation of `markdown-dispatcher`, q.v., with the `:table` and
 `:html` dispatches overridden."
 (assoc markdown-dispatcher
 :html blogger-scraper
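The hunk above builds `blogger-dispatcher` by `assoc`-ing replacement entries onto `markdown-dispatcher`. A minimal sketch of that pattern follows. The names `base-dispatcher` and `derived-dispatcher` and the stub processors are hypothetical stand-ins, not code from `html-to-md`.

```clojure
;; Hypothetical base dispatcher: a map from tag keyword to a processor,
;; where each processor takes an element and a dispatcher.
(def base-dispatcher
  {:html  (fn [_element _dispatcher] "generic page")
   :table (fn [_element _dispatcher] "generic table")
   :p     (fn [_element _dispatcher] "paragraph")})

;; Overriding one entry with assoc returns a new map; the base map is
;; unchanged and every other processor is shared between the two.
(def derived-dispatcher
  (assoc base-dispatcher
         :html (fn [_element _dispatcher] "blog post only")))
```

Because Clojure maps are immutable, the original dispatcher remains usable for general HTML while the derived one handles the special case.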
@@ -1,4 +1,5 @@
 (ns html-to-md.core
+"Top level functions intended for very simple use."
 (:require [html-to-md.transformer :refer [transform process]]
 [html-to-md.html-to-md :refer [markdown-dispatcher]]
 [html-to-md.blogger-to-md :refer [blogger-dispatcher]]))
@@ -1,4 +1,7 @@
 (ns html-to-md.html-to-md
+"Transform general HTML to
+[Markdown](https://daringfireball.net/projects/markdown/), as faithfully
+as is reasonably possible."
 (:require
 [clojure.string :as s]
 [net.cgrand.enlive-html :as html]
@@ -165,7 +168,7 @@
 
 
 (def markdown-dispatcher
-"A despatcher for transforming (X)HTML into Markdown."
+"A dispatcher for transforming (X)HTML into Markdown."
 {:a markdown-a
 :b markdown-strong
 :br markdown-br
@@ -1,14 +1,38 @@
 (ns html-to-md.transformer
+"The actual transformation engine, which is actually far more general
+than just something to generate
+[Markdown](https://daringfireball.net/projects/markdown/). It isn't as
+general as [XSL-T](https://www.w3.org/standards/xml/transformation) but
+can nevertheless do a great deal of transformation on [HT|SG|X]ML
+documents.
+
+## Terminology
+
+In this documentation the following terminology is used:
+
+* **dispatcher**: a `dispatcher` is a function (or more
+probably a map) which takes one argument, the tag of the element as a
+keyword, and returns a `processor`, q.v.
+* **processor**: a `processor` is a function of two arguments, an
+[Enlive](https://github.com/cgrand/enlive) encoded (X)HTML element and
+a `dispatcher` as described above, which processes elements into the
+desired format.
+
+## Generality
+
+**NOTE** that while `processors` within the `html-to-md` package generally
+process elements into strings (since Markdown is a text format), when
+processing into an XML format it will generally be preferable that
+`processors` should return Enlive style elements."
 (:require
 [net.cgrand.enlive-html :as html]
 [net.cgrand.tagsoup :as tagsoup]))
 
 
 (defn process
-"Process this `element`, assumed to be a [HT|SG|X]ML element in Enlive
-encoding, using this `dispatcher`, assumed to be a function (or more
-probably a map) which takes one argument, the tag of the element as
-keyword, and returns a function which processes elements with that tag.
+"Process this `element`, assumed to be a [HT|SG|X]ML element in
+[Enlive](https://github.com/cgrand/enlive)
+encoding, using this `dispatcher`,
 
 Such a function should take two arguments, the `element` itself and a
 dispatcher which will normally (but not necessarily) be the `dispatcher`
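The terminology in the docstring above can be illustrated with a minimal sketch. It assumes only that an Enlive-encoded element is a map with `:tag` and `:content` keys. `strong-processor` and `sketch-dispatcher` are hypothetical names, not part of `html-to-md`.

```clojure
(require '[clojure.string :as s])

(defn strong-processor
  "Hypothetical processor: takes an element and a dispatcher and returns
  a string. For brevity it assumes the element's content is plain strings."
  [element _dispatcher]
  (str "**" (s/join (:content element)) "**"))

(def sketch-dispatcher
  "Hypothetical dispatcher: a map from tag keyword to processor."
  {:strong strong-processor
   :b      strong-processor})

;; Look up the processor for the element's tag, then apply it to the
;; element and the dispatcher.
(let [element   {:tag :b :attrs nil :content ["bold"]}
      processor (sketch-dispatcher (:tag element))]
  (processor element sketch-dispatcher))
```

A map works as a dispatcher because Clojure maps are functions of their keys; looking up a tag with no entry simply yields `nil`.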
@@ -17,8 +41,13 @@
 If the dispatcher returns `nil`, the default behaviour is that `process`
 is mapped over the content of the element.
 
-If `element` is not an [HT|SG|X]ML element in Enlive encoding or else a
-string, returns `nil`. Strings are returned unaltered."
+If `element` is not an [HT|SG|X]ML element in Enlive encoding as described
+above, then
+
+1. if the `element` is a string, returns that string unaltered;
+2. if the `element` is a sequence or vector, maps `process` across the
+members of the sequence;
+3. otherwise, returns `nil`."
 [element dispatcher]
 (cond
 (:tag element)
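The rules in the revised docstring can be sketched as a single `cond`. This is an approximation written from the docstring alone, not the body of `process` (most of which lies outside this hunk). `process-sketch` is a hypothetical name, and using `sequential?` to cover "sequence or vector" is an assumption.

```clojure
(defn process-sketch
  "Hypothetical re-statement of the documented behaviour of `process`."
  [element dispatcher]
  (cond
    ;; An Enlive-style element: use the processor for its tag, or fall
    ;; back to processing the element's content.
    (:tag element)
    (if-let [processor (dispatcher (:tag element))]
      (processor element dispatcher)
      (process-sketch (:content element) dispatcher))

    ;; 1. Strings are returned unaltered.
    (string? element) element

    ;; 2. Sequences and vectors: map across the members, dropping nils.
    (sequential? element)
    (remove nil? (map #(process-sketch % dispatcher) element))

    ;; 3. Anything else yields nil.
    :else nil))
```

With an empty map as the dispatcher every tag falls through to its content, so `(process-sketch ["a" 42 "b"] {})` keeps the two strings and drops the number.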
@@ -32,12 +61,21 @@
 (remove nil? (map #(process % dispatcher) element))))
 
 (defn- transformer-dispatch
+"Hack to get dispatch on just the first argument to the `transform`
+multi-method."
 [a _]
 (class a))
 
 (defmulti transform
 "Transform the `obj` which is my first argument using the `dispatcher`
-which is my second argument."
+which is my second argument. `obj` can be:
+
+1. A URL or URI;
+2. A string representation of a URL or URI;
+3. A string representation of an (X)HTML fragment;
+4. An [Enlive](https://github.com/cgrand/enlive) encoded (X)HTML element;
+5. A sequence of [Enlive](https://github.com/cgrand/enlive) encoded
+(X)HTML elements."
 #'transformer-dispatch
 :default :default)
 
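The dispatch function in the hunk above ignores its second argument, so the multimethod selects a method by the class of the first argument only. A small sketch of the same technique follows. `describe-input`, `sketch-dispatch` and the methods shown are hypothetical, and the actual `transform` methods are not part of this diff.

```clojure
(defn- sketch-dispatch
  "Dispatch on the class of the first argument only."
  [a _]
  (class a))

(defmulti describe-input
  "Hypothetical multimethod that reports what kind of input it was given."
  #'sketch-dispatch
  :default :default)

;; Class-based dispatch uses isa?, so a method registered for an
;; interface also matches concrete classes implementing it.
(defmethod describe-input String [_ _] :string)
(defmethod describe-input java.net.URI [_ _] :uri)
(defmethod describe-input clojure.lang.IPersistentMap [_ _] :element)
(defmethod describe-input :default [_ _] :unknown)
```

A string input could be either a URL or an (X)HTML fragment, so a `String` method would still have to tell the two apart at run time; how `transform` does so is not shown here.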