Tau Architecture


Diagrams


. KB Content Orchestrator
  The central dispatcher and coordinator of all activity within a running Tau system
.. Client API
   All client access is processed through the Orchestrator.
.. Client NSI
   Same as the API but using network request-response protocols instead of direct calls
.. Unification / Pattern-Driven KB Content Access
   Lookup based on unification pattern matching (no inference or backtracking)
.. Cross-Reference Lookup
   "Lower-level" iterator-oriented lookup of basic KB content using cross-reference indexes


. KB Content Portal
.. Bookkeeping Generator
   Synthesize and record formulas that track the origin of KB content (optional & configurable)
.. Formula Canonicalizer
   Place formulas asserted by clients into the canonical form required by the active IP
.. Formula Cross-Referencer (source / canonical)
   Record associations between source formulas, canonical equivalents and bookkeeping records
.. Formula Indexer
   Produce term usage indexes that support browsing and pattern-directed lookup

. Language Modules
.. Parsing
.. Content Extraction
   For formats with interleaved presentation and active contents, extract the active portion
.. Synthesis
   Generate source code from internal formulas


. KB Repository
  Basic data structure (a.k.a. "Uniform Database") of knowledge base content.
  Source and Canonical
  Inferentially Active and Inert (e.g. source form)
  Original and Synthesized (e.g., bookkeeping)
  Indexed for fast exploratory browsing and unification lookup


. Inference Processor (IP)
  Inference processing: Query evaluation, truth maintenance, etc.

Draft Proposed American National Standard version of the KIF specification: <http://logic.stanford.edu/kif/dpans.html>


This note outlines the key structural and functional attributes of Tau's front-end source processing subsystem, called "Language Modules."

Language Module

Functions
The principal functions of the language modules within the Tau architecture are to parse and extract content from external source-code formats and to synthesize source-code forms from the internal knowledge representation (the UDB subsystem).


Structure
The Language Module subsystem takes the form of a "white-box" framework of abstract and concrete data types used and extended by each external language form supported within the Tau system. The principal type that must be implemented by each language module is interface tau.io.LanguageModule.


Intrinsic Vocabulary
As a system that implements formal logic (specifically, First-Order Predicate Calculus, or FOPC), there is a basic vocabulary that is universal and intrinsic to the formalisms and notations of FOPC. This intrinsic vocabulary is notated in different ways in different external source-code forms, so one key responsibility of a language module is to associate the notation's external representation for the intrinsic vocabulary with the internally defined and supplied equivalent terms.

MathML Vocabulary
The Tau intrinsic vocabulary is based on that of the MathML 2.0 specification (cf. <http://www.w3c.org/Math/>, <http://www.w3.org/TR/MathML2/> and section 4.4 of <http://www.w3.org/TR/MathML2/PDF-s-MathML-20010221.pdf>, <http://www.w3.org/TR/MathML2/PDF-p-MathML-20010221.pdf>, <http://www.w3.org/TR/MathML2/REC-MathML2-20010221.zip>, <http://www.w3.org/TR/MathML2/XHTML-MathML-20010221.zip> or <http://www.w3.org/TR/MathML2/XML-MathML-20010221.zip>).

Considering the "bulk" of the MathML intrinsic vocabulary (143 terms), Tau chooses to use identical internal names for each of those terms.

Tau-Specific Vocabulary
In addition to the MathML-specified intrinsic vocabulary, there are also several Tau-specific terms:

Vocabulary Data Types
For each term in the Tau intrinsic vocabulary (both MathML and Tau-specific) there is an internally supplied, pre-defined instance of class tau.udb.Term. The language module must choose one of these term instances for each occurrence of a term from the intrinsic vocabulary and either look up or, failing the lookup, synthesize a new Term instance for each term in the input source code.


This list of intrinsic terms was culled from the MathML 2.0 specification (<http://www.w3.org/TR/MathML2/PDF-s-MathML-20010221.pdf>, <http://www.w3.org/TR/MathML2/PDF-p-MathML-20010221.pdf>), section 4.4.


basic content elements
– apply
– reln (deprecated)
– fn (deprecated for externally defined functions)
– interval
– inverse
– sep
– condition
– declare
– lambda
– compose
– ident
– domain (MathML 2.0)
– codomain (MathML 2.0)
– image (MathML 2.0)
– domainofapplication (MathML 2.0)
– piecewise (MathML 2.0)
– piece (MathML 2.0)
– otherwise (MathML 2.0)

arithmetic, algebra and logic
– quotient
– exp
– factorial
– divide
– max and min
– minus
– plus
– power
– rem
– times
– root
– gcd
– and
– or
– xor
– not
– implies
– forall
– exists
– abs
– conjugate
– arg (MathML 2.0)
– real (MathML 2.0)
– imaginary (MathML 2.0)
– lcm (MathML 2.0)
– floor (MathML 2.0)
– ceiling (MathML 2.0)

relations
– eq
– neq
– gt
– lt
– geq
– leq
– equivalent (MathML 2.0)
– approx (MathML 2.0)
– factorof (MathML 2.0)

calculus and vector calculus
– int
– diff
– partialdiff
– lowlimit
– uplimit
– bvar
– degree
– divergence (MathML 2.0)
– grad (MathML 2.0)
– curl (MathML 2.0)
– laplacian (MathML 2.0)

theory of sets
– set
– list
– union
– intersect
– in
– notin
– subset
– prsubset
– notsubset
– notprsubset
– setdiff
– card (MathML 2.0)
– cartesianproduct (MathML 2.0)

sequences and series
– sum
– product
– limit
– tendsto

elementary classical functions
– exp
– ln
– log
– sin
– cos
– tan
– sec
– csc
– cot
– sinh
– cosh
– tanh
– sech
– csch
– coth
– arcsin
– arccos
– arctan
– arccosh
– arccot
– arccoth
– arccsc
– arccsch
– arcsec
– arcsech
– arcsinh
– arctanh

statistics
– mean
– sdev
– variance
– median
– mode
– moment
– momentabout (MathML 2.0)

linear algebra
– vector
– matrix
– matrixrow
– determinant
– transpose
– selector
– vectorproduct (MathML 2.0)
– scalarproduct (MathML 2.0)
– outerproduct (MathML 2.0)

semantic mapping elements
– annotation
– semantics
– annotation-xml

constant and symbol elements
– integers (MathML2.0)
– reals (MathML2.0)
– rationals (MathML2.0)
– naturalnumbers (MathML2.0)
– complexes (MathML2.0)
– primes (MathML2.0)
– exponential (MathML2.0)
– imaginary (MathML2.0)
– notanumber (MathML2.0)
– true (MathML2.0)
– false (MathML2.0)
– emptyset (MathML2.0)
– pi (MathML2.0)
– eulergamma (MathML2.0)
– infinity (MathML2.0)


This note outlines the key structural and content attributes of Tau's Uniform Database (UDB).


The Uniform Database (UDB)

Theory
. Assertions
. Assertion Index
. Name Table
. A single logical theory, comprising its terms and all the assertions that make it up.
. The theory is the largest-scale structural unit into which UDB content is organized.
. Operations
.. Create
   Create a new, empty UDB
.. Initialize
   Install the primary logical vocabulary into a new, blank UDB
.. Duplicate
   Create an exact copy of an existing UDB
.. Assert
   Add a new assertion to the UDB
.. Retract
   Remove an assertion from the UDB
.. Import
   Assert the entire contents of one UDB into another


Assertion
. Logical, arithmetic and / or mathematic expression including terms, variables, text strings, numbers (integer, rational, floating-point, approximated irrationals and complex numbers with vectors and matrices of the aforementioned numeric types)
. Attributes:
.. Formula
.. Category: Source / Canonical / Internal / etc.
.. Origin: Source / Inference / System / External KB / etc.
.. Longevity: Permanent / Transient / etc.
.. Status: Axiom / Rule / Premise / etc. [ ?Jay--Is this distinction relevant? ]
.. Theory of Residence
.. Unique Persistent ID
.. Variable List


Formula
. Operations:
.. Create
.. Duplicate
.. Compare
   Establish the relative (and arbitrary) ordering between two formulae.
.. Substitute
   Replace one or more of the formula's variables, producing a new formula
.. Edit
   Alter the content of the formula


Datum
. The constituents of a Formula, some are atomic and others have discernable internal structure including some which permit construction of recursive formula structures.
. Sub-type hierarchy:
.. Term
   A named, symbolic entity
.. Variable
   A variable (or unknown) in a logical or mathematical formula
.. Number
... Integer
    An integer
... Rational
    A rational number defined by its numerator and denominator
... Float
    An approximation of a real number
... Transcendental
    A transcendental number such as pi or e.
... Complex
    An complex number comprising a real and an imaginary component.
.. Vector
   A linear, one-dimensional sequence of data
.. Matrix
   A rectangular, two-dimensional array of data
.. String
   A Unicode character string
.. Expression
   A nested sub-expression in a complex formula.


Assertion Index
. Collection of indexes designed to facilitate rapid access to assertions based on the terms they reference
. Operations:
.. Term Lookup
   Locate and examine or retrieve, via iterator, visitor or collection, all assertions that make reference to a particular term in a particular argument (or predicate) position.
.. Pattern Lookup
   Locate and examine or retrieve, via iterator, visitor or collection, all assertions whose formula is consistent with a target or pattern formula that may include an arbitrary set of variables. Also known as Unification Lookup.
.. Index Assertion
   Add indexing structures required to include a new assertion in the Assertion Index.


Term Table
. Collection of all terms (symbols) extant in a given theory
. Operations:
.. Find
.. Add
.. Remove
.. Rename


Name Table
. Collection of arbitrary strings supplying the names for the terms in the UDB
. Operations:
.. Find
.. Add
.. Remove


Last update: Sunday, December 16, 2001 09:29:40 AM

Authors: Randall Schulz, Jay Halcomb