OK, there's a bug on the analyse side and I think it's in merge-rules.

All the rules that ought to be generated are being generated, but the rule
tree returned by analyse-tokens is incomplete. I'm not yet certain what is
wrong.
Simon Brooke 2013-11-08 12:58:46 +00:00
parent 08b0514908
commit 68fafdab99
3 changed files with 5 additions and 1 deletions

@@ -21,7 +21,7 @@ FIXME: listing of options this app accepts.
### Bugs
...
Not so much a bug, but as I've written this all as pure recursive functions it's vulnerable to stack exhaustion exceptions. I've specified extended stack size in the project file, but that won't be sufficient for analysing large texts.

@@ -5,4 +5,5 @@
:url "http://www.eclipse.org/legal/epl-v10.html"}
:dependencies [[org.clojure/clojure "1.5.1"]]
:main milkwood-clj.core
:jvm-opts ["-Xss4m"]
:profiles {:uberjar {:aot :all}})
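
The README note above says the analysis is written as pure recursion, so a larger `-Xss4m` stack postpones the overflow rather than removing it. A minimal sketch of one alternative, assuming the rule tree is a nested map of token => rule-tree: fold over sliding windows with `reduce`, so stack use depends on `depth` and not on the length of the text. The names `add-window` and `build-rules` are hypothetical and are not part of milkwood-clj.

```clojure
(defn add-window
  "Add one window of tokens to the rule tree as a nested path.
  Hypothetical helper, for illustration only."
  [rules window]
  ;; update-in creates any missing intermediate maps and keeps existing
  ;; branches; the leaf becomes an empty map unless a subtree is already there.
  (update-in rules (vec window) #(or % {})))

(defn build-rules
  "Fold every window of `depth` consecutive tokens into a rule tree.
  reduce iterates rather than recursing, so stack use does not grow with
  the number of tokens."
  [tokens depth]
  (reduce add-window {} (partition depth 1 tokens)))

(build-rules ["a" "b" "c" "a" "b" "d"] 2)
;; => {"a" {"b" {}}, "b" {"c" {}, "d" {}}, "c" {"a" {}}}
```

This sketch marks leaves with empty maps; the project's own representation of a leaf may differ.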

@@ -23,6 +23,7 @@
rules: a rule tree (i.e. a recursively nested map token => rule-tree);
path: a flat sequence of tokens."
[rules path]
(prn "Rule: " path)
(cond
;; if we have no more path, we're done.
(empty? path) nil
@@ -52,6 +53,7 @@
;; else just continue without adding a rule.
true (analyse-tokens rules rage (rest tokens) depth)))))
(defn analyse-file
"Read this file and process it into rules.
@@ -59,3 +61,4 @@
depth: the depth of rules/length of window we're considering"
[file depth]
(analyse-tokens nil nil (map (fn [string] (.toLowerCase string)) (re-seq #"\w+\'s|\w+|\p{Punct}" (slurp file))) depth))
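
The symptom in the commit message (every rule is generated, yet the returned tree is incomplete) is what a shallow merge of nested maps would produce, although nothing in this diff confirms that this is the actual fault in merge-rules. A sketch of the difference, assuming rule trees are nested maps as the docstring describes; `deep-merge` is a hypothetical helper, not the project's merge-rules.

```clojure
(defn deep-merge
  "Recursively merge two rule trees (nested maps of token => rule-tree).
  Hypothetical helper, for illustration only."
  [a b]
  (cond
    ;; both sides are maps: merge key by key, recursing where keys collide
    (and (map? a) (map? b)) (merge-with deep-merge a b)
    ;; nothing on the right: keep what we already have
    (nil? b) a
    ;; otherwise the right-hand value wins
    :else b))

(def left  {"the" {"cat" {"sat" nil}}})
(def right {"the" {"dog" {"ran" nil}}})

;; Shallow merge: the second "the" subtree replaces the first.
(merge left right)
;; => {"the" {"dog" {"ran" nil}}}

;; Deep merge: both branches under "the" survive.
(deep-merge left right)
;; => {"the" {"cat" {"sat" nil}, "dog" {"ran" nil}}}
```

Comparing the output of the existing merge against a recursive one on a two-rule input like this may be a quick way to confirm or rule out the merge step.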