OK, there's a bug on the analyse side, and I think it's in merge-rules. All
the rules that should be generated are being generated, but the rule tree returned by analyse-tokens is incomplete. I'm not yet certain what is wrong.
Parent: 08b0514908
Commit: 68fafdab99
@@ -21,7 +21,7 @@ FIXME: listing of options this app accepts.

 ### Bugs

-...
+Not so much a bug, but as I've written this all as pure recursive functions it's vulnerable to stack exhaustion exceptions. I've specified extended stack size in the project file, but that won't be sufficient for analysing large texts.

@@ -5,4 +5,5 @@
 :url "http://www.eclipse.org/legal/epl-v10.html"}
 :dependencies [[org.clojure/clojure "1.5.1"]]
 :main milkwood-clj.core
+:jvm-opts ["-Xss4m"]
 :profiles {:uberjar {:aot :all}})
@@ -23,6 +23,7 @@
 rules: a rule tree (i.e. a recursively nested map token => rule-tree);
 path: a flat sequence of tokens."
 [rules path]
+(prn "Rule: " path)
 (cond
 ;; if we have no more path, we're done.
 (empty? path) nil
@@ -52,6 +53,7 @@
 ;; else just continue without adding a rule.
 true (analyse-tokens rules rage (rest tokens) depth)))))

+
 (defn analyse-file
 "Read this file and process it into rules.

@@ -59,3 +61,4 @@
 depth: the depth of rules/length of window we're considering"
 [file depth]
 (analyse-tokens nil nil (map (fn [string] (.toLowerCase string)) (re-seq #"\w+\'s|\w+|\p{Punct}" (slurp file))) depth))
+