Started to look at whether MongoDB would make a useful knowledge store.
Parent: 72f486bc27
Commit: 2d395251b5
|
@ -775,7 +775,7 @@ believe'); and, implicit in the qualifier, the possibility of a rebuttal:
|
|||

|
||||
|
||||
In conversation, Toulmin argues, it may be natural simply to say
|
||||
'<data> so <c|aim>' ; to say '<c|aim> because <warrant> because
|
||||
'<data> so <claim>' ; to say '<claim> because <warrant> because
|
||||
<data>' "...strikes us as cumbrous and artificial, for it puts in an extra step which is trivial
|
||||
and unnecessary".
|
||||
|
||||
|
|
|
@ -102,7 +102,7 @@ is true":
|
|||
|
||||

|
||||
|
||||
fig 3: simplest possible rule Conjunctions are represented by columns of
|
||||
fig 1: simplest possible rule Conjunctions are represented by columns of
|
||||
nodes, only the last of which has the colour to be returned if all are
|
||||
true and disjunctions by branches, each of which terminates in the
|
||||
colour to be returned if any are true. These can be combined in any
|
||||
|
@ -111,7 +111,7 @@ individual rule structures small. This is shown in the figure below:
|
|||
|
||||

|
||||
|
||||
fig 4: example rule, showing syntax The rule would read: "(rootnode) is
|
||||
fig 2: example rule, showing syntax The rule would read: "(rootnode) is
|
||||
false unless (first conjunct) is true and (second conjunct) is true, in
|
||||
which case it is true unless either (first disjunct) or (second
|
||||
disjunct) is true".
|
||||
|
@ -262,11 +262,11 @@ our knowledge base contains the following rules:
|
|||
|
||||

|
||||
|
||||
fig 1: Rule for "Entitled to Widow's Allowance"
|
||||
fig 3: Rule for "Entitled to Widow's Allowance"
|
||||
|
||||

|
||||
|
||||
fig 2: rule for "Living with Partner"
|
||||
fig 4: rule for "Living with Partner"
|
||||
|
||||
which, together, partially encode
|
||||
the following legislation fragment, from the Social Security Act 1975
|
||||
|
|
|
@ -48,7 +48,7 @@ So we shall say that a proposition will be represented as a Clojure map with at
|
|||
|
||||
Thus
|
||||
|
||||
{:verb :killed :subject :brutus :object :caesar}
|
||||
{:verb :kill :subject :brutus :object :caesar}
|
||||
|
||||
is a proposition which asserts that Brutus killed Caesar.
|
||||
|
||||
|
@ -61,36 +61,72 @@ There may be many other privileged keys, such as
|
|||
* `:data` - an argument structure...!
|
||||
* `:authority` - id of agent from whom, or rule from which, I know this;
|
||||
|
||||
and so on. The exact set of privileged keys is probably actually a matter for particular advocates rather than for the engine itself, although if the advocates in the game don't broadly share the same set of privileged keys then it won't work very well.
|
||||
and so on. The exact set of privileged keys is probably actually a matter for
|
||||
particular advocates rather than for the engine itself, although if the advocates
|
||||
in the game don't broadly share the same set of privileged keys then it won't
|
||||
work very well.
|
||||
|
||||
*However...*
|
||||
|
||||
The attentive reader will note that some of the proposed privileged keys map closely onto the [Toulmin schema](Analysis.html#the-toulmin-schema). Thus we can say:
|
||||
The attentive reader will note that some of the proposed privileged keys map
|
||||
closely onto the [Toulmin schema](Analysis.html#the-toulmin-schema). Thus we can say:
|
||||
|
||||
* that the proposition itself is a `claim` in the sense of the **C** term;
|
||||
* that `:data` above is precisely `data` in the sense of the **D** term in Toulmin's schema, but may (is likely to) also provide a `warrant` in the sense of the **W** term;
|
||||
* that `:truth` and `:confidence` are both `qualifiers` of the claim in the sense of the **Q** term;
|
||||
* that `:authority` is a form of `backing` in the sense of the **B** term.
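
As a rough illustration of the mapping above, a proposition carrying Toulmin-style terms might look like the sketch below. Only the key names come from the text; every value (the agent `:cassius`, the boolean `:truth`, the shape of `:data` as a vector of supporting propositions) is an assumption made purely for illustration.

```clojure
;; Hypothetical example: all values here are illustrative assumptions.
(def example-proposition
  {:verb       :kill      ; :verb, :subject and :object together form the claim (C)
   :subject    :brutus
   :object     :caesar
   :truth      true       ; qualifier (Q); assumed to be a boolean here
   :confidence 0.9        ; qualifier (Q); a number between -1 and 1
   :authority  :cassius   ; backing (B); id of the agent who told us this
   ;; data (D); assumed here to be a vector of supporting propositions
   :data       [{:verb :plot-against :subject :brutus :object :caesar}]})
```
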
|
||||
|
||||
So what, then, is an 'argument structure', as described above? It seems to me that it may be exactly a proposition, with the special feature that the value of the `:data` key is not minimised.
|
||||
So what, then, is an 'argument structure', as described above? It seems to me
|
||||
that it may be exactly a proposition, with the special feature that the value
|
||||
of the `:data` key is not minimised.
|
||||
|
||||
Recall that in the chapter on Arboretum I observed that [the working of the DTree decision algorithm caused precisely those nodes to be collected whose fragments provided the most relevant explanation](Arboretum.html#relevance-filtering) to support the decision, in a natural sequence from the general to the particular. I believe that precisely the same fortuitous alchemy will provide the argument structure to provide Toulmin's **D** - our `:data` term. The DTree itself then becomes the **W** - the `:warrant`; and the author of the DTree becomes the `:authority`.
|
||||
|
||||
#### Proposition minimisation
|
||||
|
||||
How are the values of `:subject`, `:object` and so on to be passed? If we pass rich knowledge structures around, then we lose the insight that different advocates may know different things about given objects. Thus, while internally within each advocate's knowledge base objects may be stored with rich data, when they're passed around in propositions they should be minimised - that is to say, the value should just be a unique identifier, such that, for every object in the domain, if an advocate knows anything at all about that object, it knows its unique identifier and knows the object by that unique identifier.
|
||||
How are the values of `:subject`, `:object` and so on to be passed? If we pass
|
||||
rich knowledge structures around, then we lose the insight that different
|
||||
advocates may know different things about given objects. Thus, while internally
|
||||
within each advocate's knowledge base objects may be stored with rich data, when
|
||||
they're passed around in propositions they should be minimised - that is to say,
|
||||
the value should just be a unique identifier, such that, for every object in the
|
||||
domain, if an advocate knows anything at all about that object, it knows its
|
||||
unique identifier and knows the object by that unique identifier.
|
||||
|
||||
Thus the unique identifier has something of the nature of a 'true name', in the magical sense. A given true name, a given unique identifier, refers to precisely one thing in the world, and provided that two advocates both know the same true name, they can debate propositions which refer to the object with that true name.
|
||||
Thus the unique identifier has something of the nature of a 'true name', in the
|
||||
magical sense. A given true name, a given unique identifier, refers to precisely
|
||||
one thing in the world, and provided that two advocates both know the same true
|
||||
name, they can debate propositions which refer to the object with that true name.
|
||||
|
||||
Generally, a true name shall be a Clojure keyword. That keyword, passed to any advocate in the game, shall identify either `nil` (the advocate knows nothing of the object), or a map representing everything the advocate knows about the object, and within that map, the value of the key `:id` shall be that true name.
|
||||
Generally, a true name shall be a Clojure keyword. That keyword, passed to any
|
||||
advocate in the game, shall identify either `nil` (the advocate knows nothing
|
||||
of the object), or a map representing everything the advocate knows about the
|
||||
object, and within that map, the value of the key `:id` shall be that true name.
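
A minimal sketch of that contract, assuming a knowledge base that is nothing more than an in-memory map keyed by true name. The names `toy-knowledge-base` and `lookup`, and the facts stored about Brutus, are hypothetical and not part of the library.

```clojure
;; Hypothetical toy knowledge base: a plain map from true name to known facts.
(def toy-knowledge-base
  {:brutus {:id :brutus
            :name "Marcus Junius Brutus"
            :occupation :senator}})

(defn lookup
  "Return everything `kb` knows about the object whose true name is
  `true-name`, or nil if it knows nothing about that object."
  [kb true-name]
  (get kb true-name))

;; A known true name yields a map whose :id is that same true name.
(lookup toy-knowledge-base :brutus)

;; An unknown true name yields nil.
(lookup toy-knowledge-base :cicero)
```
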
|
||||
|
||||
But in saying 'the advocate knows', actually, the advocate knows nothing. The advocate has access to a knowledge base, and it is in the knowledge base that the knowledge is stored. It may be an individual knowledge base, in which case we can implement the idea that different advocates may have different knowledge about the same object, or it may be a shared consensual knowledge base.
|
||||
But in saying 'the advocate knows', actually, the advocate knows nothing. The
|
||||
advocate has access to a knowledge base, and it is in the knowledge base that
|
||||
the knowledge is stored. It may be an individual knowledge base, in which case
|
||||
we can implement the idea that different advocates may have different
|
||||
knowledge about the same object, or it may be a shared consensual knowledge
|
||||
base.
|
||||
|
||||
A proposition is represented as a map. So to minimise a proposition, for every value in that map, if the value is itself a map it shall be replaced by the value of the key `:id` in that map.
|
||||
A proposition is represented as a map. So to minimise a proposition, for every
|
||||
value in that map, if the value is itself a map it shall be replaced by the
|
||||
value of the key `:id` in that map.
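
The rule can be sketched as follows. This is a simplified illustration rather than the library's own `minimise` function: the name `minimise-sketch` is hypothetical, and the default set of preserved keys (`#{:data}`) is an assumption reflecting the earlier remark that the value of `:data` is not minimised.

```clojure
(defn minimise-sketch
  "Return a copy of `proposition` in which every map-valued entry is
  replaced by the value of its :id key, except for keys in `preserved`."
  ([proposition] (minimise-sketch proposition #{:data}))
  ([proposition preserved]
   (into {}
         (map (fn [[k v]]
                [k (if (and (map? v) (not (preserved k)))
                     (:id v)   ; rich object: keep only its true name
                     v)]))     ; anything else passes through unchanged
         proposition)))

;; A rich :subject is reduced to its true name; other values are untouched.
(minimise-sketch
 {:verb :kill
  :subject {:id :brutus :name "Marcus Junius Brutus"}
  :object :caesar})
```
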
|
||||
|
||||
This means that every implementation of the `wildwood.knowledge-accessor/Accessor` protocol must transduce whatever token its backing store uses as the primary key for an object to `:id` when it performs a `fetch` operation.
|
||||
This means that every implementation of the `wildwood.knowledge-accessor/Accessor`
|
||||
protocol must transduce whatever token its backing store uses as the primary key
|
||||
for an object to `:id` when it performs a `fetch` operation.
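
One way to picture that key translation, independent of any particular backing store, is a pair of helper functions. The names `record->object` and `object->record` are hypothetical, and the sketch assumes the store's primary key is a single top-level key in the record; a real accessor may also need to convert the key's value (for example between a store-specific id type and a keyword).

```clojure
(require '[clojure.set :refer [rename-keys]])

(defn record->object
  "On the way up from the store: rename the store's primary key `pk` to :id."
  [record pk]
  (rename-keys record {pk :id}))

(defn object->record
  "On the way down to the store: rename :id to the store's primary key `pk`."
  [object pk]
  (rename-keys object {:id pk}))

;; For a store which, like MongoDB, uses :_id as its primary key:
(record->object {:_id :brutus :occupation :senator} :_id)
```
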
|
||||
|
||||
## Thoughts on the shape of a knowledge base
|
||||
|
||||
The object of building Bialowieza as a library is that we should not constrain how applications which use the library store their knowledge. Rather, knowledge accessors must transduce between the representation used by the particular storage implementation and that defined in `wildwood.schema`. However, what we've described above suggests that a hierarchical database would be a very natural fit for knowledge base data - more natural, in this case, than a relational database.
|
||||
The object of building Bialowieza as a library is that we should not constrain
|
||||
how applications which use the library store their knowledge. Rather, knowledge
|
||||
accessors must transduce between the representation used by the particular
|
||||
storage implementation and that defined in `wildwood.schema`. However, what
|
||||
we've described above suggests that a hierarchical database would be a very
|
||||
natural fit for knowledge base data - more natural, in this case, than a
|
||||
relational database.
|
||||
|
||||
## Prejudice, and defaults
|
||||
|
||||
|
|
File diff suppressed because one or more lines are too long
4
docs/codox/wildwood.mongo-ka.html
Normal file
File diff suppressed because one or more lines are too long
|
@ -5,7 +5,9 @@
|
|||
:url "https://www.eclipse.org/legal/epl-2.0/"}
|
||||
:dependencies [[org.clojure/clojure "1.8.0"]
|
||||
[org.clojure/math.numeric-tower "0.0.4"]
|
||||
[com.taoensso/timbre "4.10.0"]]
|
||||
[com.taoensso/timbre "4.10.0"]
|
||||
[com.novemberain/monger "3.1.0"]
|
||||
[prismatic/schema "1.1.12"]]
|
||||
:codox {:metadata {:doc "**TODO**: write docs"
|
||||
:doc/format :markdown}
|
||||
:output-path "docs/codox"
|
||||
|
|
44
src/wildwood/mongo_ka.clj
Normal file
|
@ -0,0 +1,44 @@
|
|||
(ns wildwood.mongo-ka
|
||||
"A knowledge accessor fetching from and storing to Mongo DB.
|
||||
|
||||
Hierarchical databases seem a very natural fit for how we're storing
|
||||
knowledge. Mongo DB seems a particularly natural fit since its
|
||||
internal representation is JSON, which can be transformed to EDN
|
||||
extremely naturally."
|
||||
(:require [monger.core :as mg]
|
||||
[monger.collection :as mc]
|
||||
[wildwood.knowledge-accessor :refer [Accessor]])
|
||||
(:import [com.mongodb MongoOptions ServerAddress]
|
||||
[com.mongodb DB WriteConcern]
|
||||
[org.bson.types ObjectId]))
|
||||
|
||||
;; MongoDB data items are identified by ObjectId objects. In the retrieved
|
||||
;; record from MongoDB, the key value is the value of the keyword `:_id`. I don't
|
||||
;; think there's any *in principle* reason why we should not use these objects
|
||||
;; as key values - they're presumably designed to be globally unique.
|
||||
;;
|
||||
;; In which case, on the way down we have to set `:_id` to the value of `:id`
|
||||
;; and vice versa on the way back up.
|
||||
|
||||
(defrecord MongoKA
|
||||
;; It's not clear to me whether we need to pass both the connection and the
|
||||
;; database in - it's possible that the connected database handle is
|
||||
;; sufficient. The value of `:collection` is the name of the collection
|
||||
;; within the database to which this accessor writes.
|
||||
[connection db ^String collection]
|
||||
Accessor
|
||||
(fetch
|
||||
[_ id]
|
||||
(let [oid (cond
|
||||
(instance? ObjectId id) id
|
||||
(string? id) (ObjectId. id)
|
||||
(keyword? id) (ObjectId. (name id)))
|
||||
record (mc/find-by-id db collection oid)]
|
||||
(when record
|
||||
(assoc
|
||||
(dissoc record :_id)
|
||||
:id id))))
|
||||
(store [_ id proposition]
|
||||
;; don't really know how to do this and am too tired just now.
|
||||
))
|
||||
|
|
@ -29,6 +29,11 @@
|
|||
:authority ;; id of agent from whom, or rule from which, I know this.
|
||||
})
|
||||
|
||||
(def preserved-keys
|
||||
"Keys whose values should not be minimised during proposition minimisation"
|
||||
;; TODO: actually, this may end up being just :data
|
||||
(set (cons :data argument-keys)))
|
||||
|
||||
(defn proposition?
|
||||
"True if `o` qualifies as a proposition. A proposition is probably a map
|
||||
with some privileged keys, and may look something like a minimised
|
||||
|
@ -92,6 +97,8 @@
|
|||
(number? (:confidence o))
|
||||
(<= -1 (:confidence o) 1)))
|
||||
|
||||
(set (cons :data argument-keys))
|
||||
|
||||
(defn minimise
|
||||
"Expecting that `o` is a (potentially rich) proposition, return a map identical
|
||||
to `o` save that for each value `v` of key `k` in `o`, if `v` is a map and `k`
|
||||
|
@ -110,7 +117,7 @@
|
|||
{k
|
||||
(let [v (k o)]
|
||||
(if
|
||||
(and (not (argument-keys k)) (map? v))
|
||||
(and (not (preserved-keys k)) (map? v))
|
||||
(:id v)
|
||||
v))})
|
||||
(keys o)))
|
||||
|
|