META Knowledge –

META Reasoning –

META Learning

Dr. Nikolaos Bourbakis

CART, WSU

ALL TOGETHER

META Reasoning – META Knowledge – META Learning

Introduction

• Knowledge and belief alone are not adequate to formalize the process of inference.

• We need to consider all expressions as “objects” so we can describe these “objects” in terms of inference.

• We use conceptualizations and a predicate-calculus (PC) vocabulary for this purpose.

• A formal description of inference allows us to better characterize beliefs.

• Agents believe the logical closure of the sentences that can be derived, over time, by a given inference procedure. This makes agents capable of reasoning and learning.


Introduction

• Agents should be able to observe and describe their own problem-solving processes, as well as understand other agents’ responses to them;

• Intelligent agents should be able to use the results of this kind of interaction and deliberation to control subsequent inference;

• Here we use an abstraction of an agent’s beliefs that may differ from their physical description;

META Knowledge

Metaknowledge

• Metaknowledge is knowledge about knowledge.

• Beyond knowledge itself (its acquisition, origin, applicability, and dependability), metaknowledge also covers knowledge about what others know, about which information others need, and about how one’s own knowledge can be utilized accordingly.

• It includes mining existing knowledge to create new knowledge.

Metalanguage

• Formalizing inference requires conceptualizations and a formal vocabulary with which to form formal languages;

• Symbols (variables and constants) and operators are primitive “objects”;

• Complex expressions are sequences of subexpressions (not of characters). For instance, ¬P(A+B+C, D) is a sequence of the operator ¬ and the atomic sentence P(A+B+C, D);

• Example: Large(John) asserts that John is a large person; the symbol John designates the person John. The quoted symbol “John” designates the symbol John itself. Likewise, “Father(John)” designates the expression Father(John).

Metalanguage

• By nesting quoted expressions we can define various levels of languages;

• There are limitations. For example, consider the claim that John and Mary agree about what Bill’s phone number is. We use the symbol Bel to denote the relation holding between each person and the sentences that person believes:

∃n Bel(John, “Phone(Bill)=n”) ∧ Bel(Mary, “Phone(Bill)=n”)

The problem here is that the variable n appearing in the quoted expressions is taken literally: the sentence asserts that both John and Mary believe the literal sentence Phone(Bill)=n, with the symbol n, rather than agreeing on the actual number.

Metalanguage

Solution for the problem:

Rename the expressions by using sequence notation to denote an expression, instead of the quoted symbol for the expression;

Example:

The expression ¬P(A+B+C, D) can be denoted either as the symbol “¬P(A+B+C, D)” or as the list [“¬”, “P(A+B+C, D)”].

In addition, we can denote P(A+B+C, D) either as the symbol “P(A+B+C, D)” or as the list [“P”, “A+B+C”, “D”].

Furthermore, we can write “A+B+C” as the list [“+”, “A”, “B”, “C”].

This approach offers a better solution for the phone number problem.
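The sequence notation above can be mirrored directly in code. This is a minimal sketch, assuming quoted symbols are represented as Python strings and composite expressions as lists of subexpressions; the helper name `bel_args` is illustrative, not from the slides.

```python
# Sketch: denoting expressions as nested lists (sequence notation).
# A quoted symbol is a string; a composite expression is a list of
# subexpressions, as in the slide's examples.

# "¬P(A+B+C, D)" as the list ["¬", "P(A+B+C, D)"]:
neg_sentence = ["¬", "P(A+B+C, D)"]

# Unfolding one more level: "P(A+B+C, D)" as ["P", "A+B+C", "D"],
# and "A+B+C" as ["+", "A", "B", "C"]:
fully_unfolded = ["¬", ["P", ["+", "A", "B", "C"], "D"]]

# The phone-number sentence can now mention the variable n directly,
# because n is an element of the list rather than text inside a quote:
def bel_args(n):
    return ["=", "Phone(Bill)", n]

print(bel_args(5551234))  # -> ['=', 'Phone(Bill)', 5551234]
```

Because n sits inside the list rather than inside a quoted string, a quantifier outside the belief context can bind it, which is exactly what the phone-number sentence needs.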

Metalanguage

∃n Bel(John, [“=”, “Phone(Bill)”, n]) ∧ Bel(Mary, [“=”, “Phone(Bill)”, n])

Using the benefits of both quotation and lists, the final expression can be written as a combination:

∃n Bel(John, “Phone(Bill)=”.n) ∧ Bel(Mary, “Phone(Bill)=”.n)

• The metalevel language contains the relation constants Objconst, Funconst, Relconst, and Variable. Examples:

Variable(“x”)

Objconst(“John”)

Funconst(“Father”)

Relconst(“Large”)

Clausal Form

• The metalevel language lets us define other languages, so we define the syntax of clausal form;

• A constant is either an object constant, a function constant, or a relation constant;

∀x Constant(x) ↔ Objconst(x) ∨ Funconst(x) ∨ Relconst(x)

• A term is either an object constant, a variable, or a functional expression;

∀x Term(x) ↔ Objconst(x) ∨ Variable(x) ∨ Funexpr(x)

• A term list is an ordered list of terms;

∀l Termlist(l) ↔ (∀x Member(x, l) → Term(x))

• A functional expression is an expression made up of a function constant and a list of terms;

∀f∀l Funexpr(f.l) ↔ (Funconst(f) ∧ Termlist(l))

Clausal Form

• An atomic sentence is an expression made up of a relation constant and a suitable list of terms;

∀r∀l Atom(r.l) ↔ (Relconst(r) ∧ Termlist(l))

• A literal is either an atomic sentence or the negation of an atomic sentence;

∀x Literal(x) ↔ (Atom(x) ∨ (∃z x = “¬”.z ∧ Atom(z)))

• A clause is defined as an ordered list of literals;

∀c Clause(c) ↔ (∀x Member(x, c) → Literal(x))

• A database is in principle an unordered set of clauses; here we treat a database as an ordered list of clauses;

∀d Database(d) ↔ (∀x Member(x, d) → Clause(x))

Resolution Principle

• The Resolution Principle (RP) is a rule of inference that derives a conclusion from a pair of premises;

• A substitution is a list of pairs, each associating a variable with its replacement: e.g. [“x”/”F(z)”, “y”/”B”];

• The binary function Subst applies a substitution to an expression:

(1) ∀x Subst(x, [ ]) = x

(2) ∀x∀s Constant(x) → Subst(x, s) = x

(3) ∀x∀z∀s Variable(x) → Subst(x, (x/z).s) = z

(4) ∀x∀y∀z∀s Variable(x) ∧ y≠x → Subst(x, (y/z).s) = Subst(x, s)

(5) ∀x∀l∀s Subst(x.l, s) = Subst(x, s) . Subst(l, s)
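Axioms (1)-(5) can be sketched as a small recursive function. This is a minimal sketch, assuming expressions are strings (symbols) or lists of subexpressions, a substitution is a list of (variable, replacement) pairs, and, purely as a convention for this sketch, lowercase strings are variables.

```python
# A sketch of the Subst axioms (1)-(5). Expressions are strings or lists;
# a substitution is a list of (variable, replacement) pairs.

def is_variable(x):
    # Assumption for this sketch: lowercase strings are variables.
    return isinstance(x, str) and x.islower()

def subst(x, s):
    if isinstance(x, list):                 # axiom (5): map over the parts
        return [subst(e, s) for e in x]
    if not s:                               # axiom (1): empty substitution
        return x
    if not is_variable(x):                  # axiom (2): constants unchanged
        return x
    (v, z), rest = s[0], s[1:]
    if x == v:                              # axiom (3): first binding matches
        return z
    return subst(x, rest)                   # axiom (4): try remaining bindings

# P(x, x, y, v) with {x/A, y/F(B)} -> P(A, A, F(B), v)
print(subst(["P", "x", "x", "y", "v"], [("x", "A"), ("y", ["F", "B"])]))
# -> ['P', 'A', 'A', ['F', 'B'], 'v']
```

Note how the unbound variable v survives untouched, exactly as in the substitution example on the later unification slide.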

Resolution Principle

A substitution can be extended with a binding for a new variable by substituting the new value into the bindings for the variables in the initial substitution and then adding the new association to the old substitution:

∀x∀z Extend([ ], x, z) = [x/z]

∀u∀v∀x∀z∀s Extend((u/v).s, x, z) = (u/Subst(v, [x/z])) . Extend(s, x, z)

Two substitutions can be combined by incrementally extending one with each of the elements of the other, as described by the axioms:

∀s Combine(s, [ ]) = s

∀s∀t∀x∀z Combine(s, (x/z).t) = Combine(Extend(s, x, z), t)
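The Extend and Combine axioms above can be sketched directly. This is a minimal sketch with substitutions as lists of (variable, replacement) pairs; the helper `subst1`, which applies a single binding, is an illustrative name introduced here.

```python
# Sketch of Extend and Combine over substitutions represented as lists of
# (variable, replacement) pairs, following the axioms on this slide.

def subst1(x, v, z):
    # Apply the single binding {v/z} to expression x (strings or lists).
    if isinstance(x, list):
        return [subst1(e, v, z) for e in x]
    return z if x == v else x

def extend(s, x, z):
    # Push {x/z} through the existing bindings, then append the new pair.
    return [(u, subst1(val, x, z)) for (u, val) in s] + [(x, z)]

def combine(s, t):
    # Incrementally extend s with each element of t.
    for (x, z) in t:
        s = extend(s, x, z)
    return s

print(combine([("x", "A")], [("y", "B"), ("z", "C")]))
# -> [('x', 'A'), ('y', 'B'), ('z', 'C')]
```

This matches the mgu example later in the slides: combining {x/A} with {y/B, z/C} yields the less general unifier {x/A, y/B, z/C}.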

Resolution Principle: Unification

Example (substitution):

P(x,x,y,v){x/A, y/F(B), z/w} = P(A,A,F(B),v)

• A substitution τ is distinct from a substitution σ if and only if no variable bound in σ occurs anywhere in τ (although variables with bindings in τ may occur in σ).

Resolution Principle: Unification

A most general unifier (mgu) γ of Φ and Ψ has the property that, if σ is any unifier of the two expressions, then there exists a substitution δ such that:

Φγδ = Φσ = Ψσ

An mgu is unique up to renaming of variables.

Example: {x/A} is an mgu of the expressions P(A,y,z) and P(x,y,z).

The less general unifier {x/A, y/B, z/C} can be generated from the mgu {x/A} and the substitution {y/B, z/C}.

Resolution Principle

• The relation Mgu holds between two expressions and their most general unifier, if one exists.

• The mgu of two composite expressions is built from the mgus of the parts of the expressions:

∀x Mgu(x, x, [ ])

∀x∀y Variable(x) ∧ ¬Among(x, y) → Mgu(x, y, [x/y])

∀x∀y ¬Variable(x) ∧ Variable(y) ∧ ¬Among(y, x) → Mgu(x, y, [y/x])

∀x∀y∀l∀m∀s∀t Mgu(x, y, s) ∧ Mgu(Subst(l, s), Subst(m, s), t) → Mgu(x.l, y.m, Combine(s, t))
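The four Mgu axioms can be sketched as a unification routine. This is a minimal sketch, assuming expressions are strings or lists, lowercase strings are variables (a convention of this sketch), and Among is the usual occurs check; it returns the bindings, or None when no unifier exists.

```python
# A unification sketch following the four Mgu axioms. The result is a
# list of (variable, replacement) bindings, or None if no unifier exists.

def is_variable(x):
    return isinstance(x, str) and x.islower()

def among(v, x):
    # Occurs check: does variable v appear anywhere in x?
    if isinstance(x, list):
        return any(among(v, e) for e in x)
    return v == x

def subst(x, s):
    if isinstance(x, list):
        return [subst(e, s) for e in x]
    for (v, z) in s:
        if x == v:
            return subst(z, s)
    return x

def mgu(x, y, s=None):
    s = [] if s is None else s
    x, y = subst(x, s), subst(y, s)
    if x == y:                                   # axiom 1: identical parts
        return s
    if is_variable(x) and not among(x, y):       # axiom 2: bind x to y
        return s + [(x, y)]
    if is_variable(y) and not among(y, x):       # axiom 3: bind y to x
        return s + [(y, x)]
    if isinstance(x, list) and isinstance(y, list) and len(x) == len(y):
        for xi, yi in zip(x, y):                 # axiom 4: unify part by part
            s = mgu(xi, yi, s)
            if s is None:
                return None
        return s
    return None

print(mgu(["P", "A", "y", "z"], ["P", "x", "y", "z"]))  # -> [('x', 'A')]
```

The occurs check (Among) is what rejects pathological cases such as unifying x with F(x).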

Resolution Principle

• We now use the mgu to define resolution as inference;

• If one clause begins with a literal x and another begins with a negative literal whose argument unifies with x, then one resolvent of the two clauses is obtained by substituting the unifier into the clause formed of the remaining elements of the two clauses:

∀x∀y∀l∀m∀s Mgu(x, y, s) → Resolvent(x.l, (“¬”.y).m, Subst(Append(l, m), s))

• (where the function Append concatenates lists, e.g. Append([1,2], [3,4]) = [1,2,3,4]):

∀m Append([ ], m) = m

∀x∀l∀m Append(x.l, m) = x.Append(l, m)

Resolution Principle

• In general, we allow resolution on any literals in the two clauses. If a literal x is an element of one clause, ¬y is an element of another clause, and there is an mgu s for x and y, then the resolvent of the two clauses is formed by deleting the complementary literals, appending the remaining literals, and applying the unifier (we also rename the remaining variables):

∀c∀d∀x∀y∀s Member(x, c) ∧ Member(“¬”.y, d) ∧ Mgu(x, y, s) →
Resolvent(c, d, Subst(Append(Delete(x, c), Delete(“¬”.y, d)), s))

We will use this definition of RP to formulate the resolution strategies.
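The delete-complementary-literals-and-append construction can be sketched for the ground (variable-free) case, where the unifier is empty and substitution can be omitted. This is a simplification of the slide's general rule: clauses are lists of literal strings, with “¬” prefixing a negative literal.

```python
# Sketch of the resolvent construction for the ground case: pick a
# literal x in one clause and its complement in the other, delete both,
# and append what remains of the two clauses.

def delete(x, c):
    return [e for e in c if e != x]

def resolvents(c, d):
    out = []
    for x in c:
        y = x[1:] if x.startswith("¬") else "¬" + x   # complement of x
        if y in d:
            out.append(delete(x, c) + delete(y, d))
    return out

# Resolving [P, Q] with [¬Q, R] on Q yields [P, R]:
print(resolvents(["P", "Q"], ["¬Q", "R"]))  # -> [['P', 'R']]
```

In the full first-order rule, the mgu of x and y would additionally be applied to the appended remainder.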

Inference Procedure: Rules of Inference

• The Resolution Principle (RP) is a rule of inference that derives a conclusion from a pair of premises;

• Here, RP is a ternary relation that holds among three clauses whenever the third clause is a resolvent of the first two;

• The basic element in resolution is unification, and the basic element in unification is the notion of substitution;

Rules of Inference

• Inference is the process of deriving conclusions from premises,

in one or more steps.

• Each step must be sanctioned by an acceptable rule of

inference.

• A rule of inference consists of:

(1) a set of sentence patterns, called conditions; and

(2) another set of sentence patterns, called conclusions.

• WHEN we have sentences that match the conditions, THEN it is acceptable to infer sentences matching the conclusions.

Rules of inference

• Modus ponens (MP):

conditions: φ => ψ, φ
conclusion: ψ

Example: from On(A, B) => Above(A, B) and On(A, B), infer Above(A, B).

• Modus tollens (MT):

conditions: φ => ψ, ¬ψ
conclusion: ¬φ

Example: from On(A, B) => Above(A, B) and ¬Above(A, B), infer ¬On(A, B).
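The pattern-matching reading of these two rules can be sketched in a few lines. This is a toy sketch: sentences are encoded as Python values, an implication as the tuple ("=>", φ, ψ) and a negation as ("¬", φ); these encodings and function names are illustrative, not from the slides.

```python
# Toy sketch of modus ponens and modus tollens over tuple-encoded
# sentences: an implication is ("=>", phi, psi), a negation ("¬", phi).

def modus_ponens(imp, fact):
    op, phi, psi = imp
    if op == "=>" and fact == phi:   # conditions: phi => psi and phi
        return psi                   # conclusion: psi
    return None

def modus_tollens(imp, fact):
    op, phi, psi = imp
    if op == "=>" and fact == ("¬", psi):  # conditions: phi => psi and ¬psi
        return ("¬", phi)                  # conclusion: ¬phi
    return None

rule = ("=>", "On(A,B)", "Above(A,B)")
print(modus_ponens(rule, "On(A,B)"))             # -> Above(A,B)
print(modus_tollens(rule, ("¬", "Above(A,B)")))  # -> ('¬', 'On(A,B)')
```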

Rules of inference

• And elimination (AE):

condition (conjunction): φ ∧ ψ
conclusions: φ, ψ

Example: from Clear(A) ∧ Table(A), infer Clear(A) and Table(A).

Rules of inference

• And introduction (AI):

conditions: φ, ψ
conclusion (conjunction): φ ∧ ψ

Example: from Clear(A) and Table(A), infer Clear(A) ∧ Table(A).

Rules of inference

• Universal instantiation (UI): from ∀ν φ, infer φν/τ

(inference from the general to the particular)

Examples: from ∀y Loves(Jane, y) infer Loves(Jane, Jill);

or from ∀y Loves(Jane, y) infer Loves(Jane, y), with y free (y can be many things).

(In a conclusion with a free variable, be careful to avoid conflicts with other variables!)

Rules of inference

• Universal instantiation (UI): from ∀ν φ, infer φν/τ

(inference from the general to the particular)

Examples: from ∀y ∃z Loves(y, z) we may infer ∃z Loves(Mom(x), z);

but from ∀y ∃z Hates(y, z) we may NOT infer ∃z Hates(Mom(z), z).

Here τ must be free for ν in φ: τ is free for ν in φ if and only if ν does not occur within the scope of a quantifier of some variable in τ.

Mom(x) is free for y; Mom(z) is NOT free for y (its variable z would be captured by ∃z), so we cannot substitute it!

Rules of inference

• Existential instantiation (EI): from ∃ν φ, infer φν/π(ν1,…,νn)

where π is a new function constant and ν1,…,νn are the free variables in φ.

(It allows us to eliminate existential quantifiers.)

* The free variables in the replacement term capture the relationship between the value of the ∃ variable and the free variables in the expression.

* If there are no free variables in the expression, the variable can be replaced by a new constant.
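The EI rule (often called Skolemization) can be sketched as follows. This is a minimal sketch, assuming expressions are strings or lists, lowercase strings are variables (a convention of this sketch), and fresh function constants are generated as f1, f2, …; the name `instantiate_existential` is illustrative.

```python
import itertools

# Sketch of existential instantiation: replace the ∃-bound variable by a
# term built from a fresh function constant applied to the free variables
# of the body (or a plain fresh constant when there are none).

_fresh = itertools.count(1)

def free_vars(x):
    if isinstance(x, list):
        out = []
        for e in x:
            for v in free_vars(e):
                if v not in out:
                    out.append(v)
        return out
    return [x] if isinstance(x, str) and x.islower() else []

def instantiate_existential(var, body):
    fv = [v for v in free_vars(body) if v != var]
    pi = f"f{next(_fresh)}"              # fresh function constant
    term = [pi] + fv if fv else pi       # plain constant if no free vars
    def replace(x):
        if isinstance(x, list):
            return [replace(e) for e in x]
        return term if x == var else x
    return replace(body)

# ∃z Hates(y, z) becomes Hates(y, f1(y)): the fresh function is applied
# to the free variable y, capturing the dependence of z on y.
print(instantiate_existential("z", ["Hates", "y", "z"]))
```

The free variable y ends up as an argument of the fresh function, which is exactly the dependence that the starred remarks above describe.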

Rules of inference

• Existential instantiation (EI): from ∃ν φ, infer φν/π(ν1,…,νn)

(It allows us to eliminate existential quantifiers.)

Examples: from ∃z Hates(y, z), infer Hates(y, Foe(y)), where Foe is a new function constant;

from ∃y ∀x Hates(x, y), infer ∀x Hates(x, Mike), where Mike is a new constant.

Note: ∃y ∀x Hates(x, y) ≠ ∀x ∃y Hates(x, y).

NOT allowed: from ∃z Hates(Jill, z), infer Hates(Jill, Jill).


Inference Procedure

• An inference procedure is defined as a function that maps an initial database Δ and a positive integer n into the database for the n-th step of inference on Δ;

• We use the function constant Step to denote an arbitrary inference procedure;

• A Markov inference procedure (MIP) is a function (named Next) that maps a database into a successor database:

∀d Step(d, 1) = d

∀d∀n n>1 → Step(d, n) = Next(Step(d, n-1))

• A Markov procedure is memoryless, but we can also consider history-based procedures;
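The two Step axioms can be sketched as a recursion over an arbitrary Next function. This is a minimal sketch; `toy_next` is a stand-in introduced here purely to show the recursion, not a real resolution step.

```python
# Sketch of a Markov inference procedure: Step(d, 1) = d and
# Step(d, n) = Next(Step(d, n-1)). Next is passed in as a parameter,
# since the axioms leave it abstract.

def step(d, n, next_fn):
    if n <= 1:
        return d               # Step(d, 1) = d
    return next_fn(step(d, n - 1, next_fn))

def toy_next(d):
    # Illustrative stand-in for Next: just append a marker per step.
    return d + [f"step{len(d)}"]

print(step(["c0"], 3, toy_next))  # -> ['c0', 'step1', 'step2']
```

Because `step` consults only the previous database, the procedure is memoryless in exactly the Markov sense the slide describes.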

Inference Procedure

For example:

• Define the function Concs, which maps a clause and a database into the list of all resolvents in which the specified clause is one parent and the other parent is a member of the specified database:

∀c Concs(c, [ ]) = [ ]

∀c∀d∀e∀l Resolvent(c, d, e) → Concs(c, d.l) = e.Concs(c, l)

∀c∀d∀l (¬∃e Resolvent(c, d, e)) → Concs(c, d.l) = Concs(c, l)

• The initial database is obtained by adding the clause (with answer literal) obtained from the negation of the query to the front of the database. On each step, the procedure removes the first element of the database and adds all one-step conclusions to the remainder of the database:

Next(d) = Append(Concs(Car(d), d), Cdr(d))
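Concs and Next can be sketched for the ground (variable-free) case, with Car/Cdr as list head/tail. This is a simplified sketch: clauses are lists of literal strings with “¬” marking negation, and the resolvent computation omits unification.

```python
# Sketch of Concs and Next for the ground case. Concs collects every
# resolvent with the given clause as one parent and a database member as
# the other; Next pops the first clause (Car) and appends its one-step
# conclusions to the remainder (Cdr).

def resolvent(c, d):
    for x in c:
        y = x[1:] if x.startswith("¬") else "¬" + x
        if y in d:
            return [e for e in c if e != x] + [e for e in d if e != y]
    return None

def concs(c, db):
    out = []
    for d in db:
        r = resolvent(c, d)
        if r is not None:
            out.append(r)
    return out

def next_db(db):
    car, cdr = db[0], db[1:]
    return concs(car, db) + cdr

# The two clauses [Q] and [¬Q] resolve to the empty clause []:
db = [["Q"], ["¬Q"]]
print(next_db(db))  # -> [[], ['¬Q']]
```

Running one step on the database [[Q], [¬Q]] produces the empty clause, which is the situation illustrated later in the metalevel reasoning picture.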

Inference Procedure

Example: the goal here is to show that there is a z such that R(z) is true.

On the first step the goal clause is removed and replaced by two subgoals;

the first of these is then reduced to a further subgoal on the second step;

that goal is then resolved with a unit clause to produce the empty clause.

Derivability and Belief

• Derivability is conceptualized as a binary relation between a database and an individual sentence;

• We know that “a sentence is derivable from a database if and only if (a) it is in the database or (b) it can be derived from the database using the rules of inference”;

• Using the Resolvent relation we formalize this definition:

∀d∀r Derivable(d, r) ↔ Member(r, d) ∨ (∃p∃q Derivable(d, p) ∧ Derivable(d, q) ∧ Resolvent(p, q, r))

To eliminate the peculiarity of this process we use restricted derivability;

Derivability and Belief

So we can say a sentence is derivable by an RP Step from an initial database if and only if on some step the procedure produces a database which contains that sentence:

∀d∀p Derivable(d, p) ↔ (∃n Member(p, Step(d, n)))

Since resolution is not generatively complete, but is refutation complete, derivability works through the refutation of sentences.

So we say that “a sentence is provable by RP if and only if the empty clause can be derived by that RP from the clauses in the database together with the clauses in the clausal form of the negated sentence”:

∀d∀p Provable(d, p) ↔ Derivable(Append(Clauses(“¬”.p), d), [ ])

Derivability and Belief

• The function Clauses maps a sentence into the list of clauses in its clausal form;

• We use the notion of provability to define the meaning of “an agent believes a sentence”.

First we assume there is a function Data that maps an agent into the list of sentences explicitly stored in that agent’s database. Then we define belief as a binary relation that holds between an agent and a sentence if and only if the sentence is provable from the agent’s database:

∀a∀p Bel(a, p) ↔ Provable(Data(a), p)

This allows the inference procedures of agents to be described declaratively;

META Reasoning

Metalevel Reasoning

“Experts report on the latest artificial

intelligence research concerning reasoning

about reasoning itself. The capacity to think

about our own thinking may lie at the heart of

what it means to be both human and

intelligent”.

• Reasoning about reasoning

Metalevel Reasoning

• One of the advantages of encoding metalevel knowledge as

sentences in predicate calculus is that we can use automated

reasoning procedures in answering questions about any

reasoning process described;

• By reasoning about reasoning we mean metalevel reasoning, or metareasoning;

• In the preceding formalization of knowledge about reasoning procedures we assumed definitions for the relations Variable, Objconst, Funconst, and Relconst, and we exploited relationships between symbols and lists (e.g. Variable(“x”) is true, and the symbol “P[A,B]” designates the term [“P”, “A”, “B”]).

• This is a problem if we encode such information in the metalanguage.

Metalevel Reasoning

Problem:

Since there are infinitely many symbols and we cannot quantify over parts of symbols, we would need infinitely many axioms to completely define such relationships.

For example, we can consider a metareasoning procedure based on resolution. In this procedure we explicitly encode information about the fundamental type relations by adding appropriate procedural attachments. We deal with the equivalence of quoted symbols and lists of quoted symbols by modifying the unifier;

Metalevel Reasoning

The procedural attachments for the four relations (Variable, Objconst, Funconst, and Relconst) are similar to one another.

For example, consider a clause that contains one literal of the form Variable(“φ”), where φ stands for any expression in our language (we do not literally mean the symbol φ, which is not itself an expression of the language).

If φ is a variable, then the literal is true and the clause can be dropped from the database (since it cannot be used to derive the empty clause).

If φ is anything other than a variable, then the literal is false and can be dropped from the clause.

For clauses that contain a literal ¬Variable(“φ”), the results are reversed.

Metalevel Reasoning

• This is a procedure for computing the most general unifier (mgu)

Metalevel Reasoning

The following picture presents an illustration of the entire reasoning procedure (using the definition of derivability), showing that the empty clause is derivable from a database consisting of the two clauses [Q] and [¬Q].

META Learning

Learning how to Learn

Basic Machine Learning Schemes

Next Level Machine Learning Schemes

Basic Machine Learning Methods

• Knowledge Representation

• Resolution

• Inference

• Non-monotonic Reasoning

• Generalization – Induction

Next Level Machine Learning Methods

• Regression

• Classification

• Clustering

• Dimensionality Reduction

• Ensemble Methods

• Neural Nets and Deep Learning

• Transfer Learning

• Reinforcement Learning

• Natural Language Processing

• Word Embeddings

wikipedia

Next Level Machine Learning Methods

• Regression

Regression methods fall within

the category of supervised ML.

They help to predict or explain

a particular numerical value

based on a set of prior data, for

example predicting the price of

a property based on previous

pricing data for similar

properties.

Linear Regression Model Estimates of

Building’s Energy Consumption (kWh).
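The prediction-from-prior-data idea can be sketched with a closed-form least-squares line fit. This is a minimal sketch with made-up (size, price) pairs; the data, units, and function name are illustrative, not from the slides.

```python
# Sketch of supervised regression: fit a least-squares line to prior
# (size, price) pairs, then predict the price of a new property.

def fit_line(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    # Closed-form simple linear regression coefficients.
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
            sum((x - mx) ** 2 for x in xs)
    intercept = my - slope * mx
    return slope, intercept

sizes = [50, 70, 90, 110]      # square metres (made-up training data)
prices = [100, 140, 180, 220]  # price in thousands (made-up)

slope, intercept = fit_line(sizes, prices)
print(slope * 80 + intercept)  # predicted price for an 80 m² property
```

With this perfectly linear toy data the fit recovers price = 2 × size, so the 80 m² prediction is 160; real pricing data would of course be noisy and multi-featured.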

Next Level Machine Learning Methods

•Classification

Another class of supervised ML, classification

methods predict or explain a class value. For

example, they can help predict whether or not

an online customer will buy a product. The

output can be yes or no. But classification

methods aren’t limited to two classes. For

example, a classification method could help to

assess whether a given image contains a car or

a truck. In this case, the output will be 3

different values: 1) the image contains a car, 2)

the image contains a truck, or 3) the image

contains neither a car nor a truck.

Logistic Regression Decision Boundary:

Admitted to College or Not?

Next Level Machine Learning Methods

•Clustering

With clustering methods, we get into the

category of unsupervised ML because

their goal is to group or cluster

observations that have similar

characteristics. Clustering methods don’t

use output information for training, but

instead let the algorithm define the

output. In clustering methods, we can only

use visualizations to inspect the quality of

the solution.

Clustering Buildings into Efficient (Green) and

Inefficient (Red) Groups.

Next Level Machine Learning Methods

•Dimensionality Reduction

As the name suggests, we use dimensionality

reduction to remove the least important

information (sometime redundant columns)

from a data set. In practice, I often see data

sets with hundreds or even thousands of

columns (also called features), so reducing

the total number is vital. For instance, images

can include thousands of pixels, not all of

which matter to your analysis. Or when

testing microchips within the manufacturing

process, you might have thousands of

measurements and tests applied to every

chip, many of which provide redundant

information. In these cases, you need

dimensionality reduction algorithms to make

the data set manageable.

t-SNE Iterations on MNIST Database of

Handwritten Digits

Next Level Machine Learning Methods

•Ensemble Methods

Imagine you’ve decided to build a bicycle because you are not feeling

happy with the options available in stores and online. You might begin

by finding the best of each part you need. Once you assemble all these

great parts, the resulting bike will outshine all the other options.

Ensemble methods use this same idea of combining several predictive

models (supervised ML) to get higher quality predictions than each of

the models could provide on its own. For example, the Random Forest

algorithms is an ensemble method that combines many Decision Trees

trained with different samples of the data sets. As a result, the quality

of the predictions of a Random Forest is higher than the quality of the

predictions estimated with a single Decision Tree.

Next Level Machine Learning Methods

• Neural Nets and Deep Learning

In contrast to linear and logistic regressions

which are considered linear models, the

objective of neural networks is to capture nonlinear patterns in data by adding layers of

parameters to the model. In the image, the

simple neural net has three inputs, a single

hidden layer with five parameters, and an output

layer.

It’s especially difficult to keep up with

developments in deep learning, in part because

the research and industry communities have

doubled down on their deep learning efforts,

spawning whole new methodologies every day.

Neural Net with one Hidden Layer

Deep Learning: Neural Network with

Many Hidden Layers

Next Level Machine Learning Methods

•Transfer Learning

Let’s pretend that you’re a data scientist working in the

retail industry. You’ve spent months training a high-quality model to classify images as shirts, t-shirts and

polos. Your new task is to build a similar model to

classify images of dresses as jeans, cargo, casual, and

dress pants. Can you transfer the knowledge built into

the first model and apply it to the second model? Yes,

you can, using Transfer Learning.

Next Level Machine Learning Methods

•Reinforcement Learning

Imagine a mouse in a maze trying to find hidden pieces of cheese. The

more times we expose the mouse to the maze, the better it gets at

finding the cheese. At first, the mouse might move randomly, but after

some time, the mouse’s experience helps it realize which actions bring

it closer to the cheese.

The process for the mouse mirrors what we do with Reinforcement

Learning (RL) to train a system or a game. Generally speaking, RL is a

machine learning method that helps an agent learn from experience.

By recording actions and using a trial-and-error approach in a set

environment, RL can maximize a cumulative reward. In our example,

the mouse is the agent and the maze is the environment. The set of

possible actions for the mouse are: move front, back, left or right.

Next Level Machine Learning Methods

•Natural Language Processing

A huge percentage of the world’s data and knowledge is in some form of

human language. Can you imagine being able to read and comprehend

thousands of books, articles and blogs in seconds? Obviously, computers can’t

yet fully understand human text but we can train them to do certain tasks. For

example, we can train our phones to autocomplete our text messages or to

correct misspelled words. We can even teach a machine to have a simple

conversation with a human.

Natural Language Processing (NLP) is not a machine learning method per se,

but rather a widely used technique to prepare text for machine learning. Think

of tons of text documents in a variety of formats (word, online blogs, ….). Most

of these text documents will be full of typos, missing characters and other words that need to be filtered out. At the moment, the most popular package for processing text is NLTK (Natural Language ToolKit), created by researchers at the University of Pennsylvania.

Next Level Machine Learning Methods

•Word Embeddings

TFM and TFIDF are numerical representations of text documents

that only consider frequency and weighted frequencies to represent

text documents. By contrast, word embeddings can capture the

context of a word in a document. With the word context,

embeddings can quantify the similarity between words, which in

turn allows us to do arithmetic with words.

Word2Vec is a method based on neural nets that maps words in a

corpus to a numerical vector. We can then use these vectors to find

synonyms, perform arithmetic operations with words, or to

represent text documents (by taking the mean of all the word

vectors in a document).

Advanced Machine Learning Methods:

Synergies of F&S

• Formal learning cannot generalize beyond a point, but it is very good at low-level learning via reasoning;

• Statistical learning cannot reason efficiently at low levels, but it is good for large-scale reasoning and learning;

• Thus, mixing formal and statistical methods is the best solution for AI problems.

END
