Replaced README with a pointer to new documentation.
parent 10d8574ace
commit f69fb619cb
78
README.md
@@ -4,82 +4,6 @@ A Clojure library designed to convert
([Enlive](https://github.com/cgrand/enlive)ned) HTML to markdown; but, more
generally, a framework for [HT|SG|X]ML transformation.

## Introduction
[Documentation is here](https://simon-brooke.github.io/html-to-md/)

The itch I'm trying to scratch at present is to transform
[Blogger.com](http://www.blogger.com)'s dreadful tag-soup markup into markdown;
but my architecture for doing this is to build a completely general [HT|SG|X]ML
transformation framework and then specialise it.

**WARNING:** this is presently alpha-quality code, although it does have fair
unit test coverage.

## Usage

To use this library in your project, add the following Leiningen dependency:

    [org.clojars.simon_brooke/html-to-md "0.3.0"]

To use it in your namespace, require:

    [html-to-md.core :refer [html-to-md]]

For default usage, that's all you need. To play more sophisticated tricks,
consider:

    [html-to-md.transformer :refer [transform process]]
    [html-to-md.html-to-md :refer [markdown-dispatcher]]

The intended usage is as follows:

```clojure
(require '[html-to-md.core :refer [html-to-md]])

(html-to-md url output-file)
```

This will read (X)HTML from `url` and write Markdown to `output-file`. If
`output-file` is not supplied, it will return the markdown as a string:

```clojure
(require '[html-to-md.core :refer [html-to-md]])

(def md (html-to-md url))
```

If you are specifically scraping [blogger.com](https://www.blogger.com/)
pages, you may *try* the following recipe:

```clojure
(require '[html-to-md.core :refer [blogger-to-md]])

(blogger-to-md url output-file)
```

It works for my blogger pages. However, I'm not sure to what extent the
skinning of blogger pages is pure CSS (in which case my recipe should work
for yours) and to what extent it's HTML templating (in which case it
probably won't). Results are not guaranteed; if it doesn't work, you get to
keep all the pieces.

## Extending the transformer

In principle, the transformer can transform any [HT|SG|X]ML markup into any
other, or into any textual form. To extend it to do something other than
markdown, supply a **dispatcher**. A dispatcher is essentially a function of one
argument, a [HT|SG|X]ML tag represented as a Clojure keyword, which returns
a **processor**, which should be a function of two arguments: an element assumed
to have that tag, and a dispatcher. The processor should return the value that
you want elements of that tag transformed into.

Obviously it is convenient to write dispatchers as maps, but it isn't required
that you do so: anything which, given a keyword, will return a processor, will
work.
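
As a rough illustration of the contract described above, here is a minimal, self-contained sketch. It does not use the library's own `transform` or `process` functions: `render`, `children-text`, the two processors and both dispatchers are hypothetical names invented for this example, and elements are assumed to be Enlive-style maps with `:tag` and `:content` keys.

```clojure
(require '[clojure.string :as str])

;; Hypothetical driver, declared early so processors can recurse through it.
(declare render)

(defn children-text
  "Render every child of `element` using `dispatcher` and join the results."
  [element dispatcher]
  (str/join (map #(render % dispatcher) (:content element))))

;; Processors: functions of two arguments, an element and a dispatcher.
(defn emphasis-processor
  [element dispatcher]
  (str "*" (children-text element dispatcher) "*"))

(defn paragraph-processor
  [element dispatcher]
  (str (children-text element dispatcher) "\n\n"))

;; A dispatcher written as a map: Clojure maps are functions of their keys.
(def map-dispatcher
  {:em emphasis-processor
   :p  paragraph-processor})

;; The same dispatcher written as a plain function of one keyword.
(defn fn-dispatcher
  [tag]
  (case tag
    :em emphasis-processor
    :p  paragraph-processor
    nil))

(defn render
  "Stand-in for the library's traversal (an assumption, not its real code).
  Strings pass through unchanged; elements go to the processor the dispatcher
  returns for their tag; tags with no processor just have their children
  rendered."
  [node dispatcher]
  (if (string? node)
    node
    (if-let [processor (dispatcher (:tag node))]
      (processor node dispatcher)
      (children-text node dispatcher))))
```

Given the element `{:tag :p :content ["Hello " {:tag :em :content ["world"]} "!"]}`, calling `render` with either dispatcher yields `"Hello *world*!\n\n"`, which is the point of the paragraph above: the map and the function are interchangeable because both return a processor when given a keyword.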

## License

Copyright © 2019 Simon Brooke <simon@journeyman.cc>

Distributed under the Eclipse Public License either version 1.0 or (at
your option) any later version.