CS 171: Introduction to Artificial Intelligence (Lecture Notes)
Machine Learning (2): Other Classifiers
Prof. Alexander Ihler

Outline
• Different types of learning problems
• Different types of learning algorithms
• Supervised learning
  – Decision trees
  – Naïve Bayes
  – Perceptrons, multi-layer neural networks
  – Boosting (see papers and Viola-Jones slides on class website)
• Applications: learnin…
Page 29
Linear Classifiers
Linear classifier ⇔ single linear decision boundary (for the 2-class case)
We can always represent a linear decision boundary by a linear equation:

w_1 x_1 + w_2 x_2 + … + w_d x_d = Σ_j w_j x_j = w^T x = 0
In d dimensions, this defines a (d-1)-dimensional hyperplane:
for d = 3 we get a plane; for d = 2, a line.

For prediction we simply check whether Σ_j w_j x_j > 0.

The w_j are the weights (parameters). Learning consists of searching the d-dimensional weight space for the set of weights (the linear boundary) that minimizes an error measure.

A threshold can be introduced by a "dummy" feature that is always 1; its weight corresponds to (the negative of) the threshold. A sketch of this prediction rule follows below.

Note that a minimum-distance classifier is a special (restricted) case of a linear classifier.
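To make the prediction rule concrete, here is a minimal sketch in Python/NumPy (not from the lecture; the weights and input are made-up illustrative values) of the Σ_j w_j x_j > 0 test, with the dummy always-1 feature carrying the threshold:

```python
import numpy as np

def predict(w, x):
    """Linear classifier: +1 if w . x > 0, else -1."""
    return 1 if np.dot(w, x) > 0 else -1

# Illustrative boundary: 1.0*x1 + 2.0*x2 - 8.0 = 0 (made-up weights).
# The last weight pairs with a dummy feature fixed at 1, so it plays
# the role of (the negative of) the threshold.
w = np.array([1.0, 2.0, -8.0])

x = np.array([3.0, 4.0, 1.0])  # features (3, 4) plus the dummy 1
print(predict(w, x))           # 1*3 + 2*4 - 8 = 3 > 0, so prints 1
```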


Page 30
A Possible Decision Boundary
[Figure: two-class data plotted on axes FEATURE 1 vs. FEATURE 2 (both 0-8), with one candidate linear decision boundary drawn through the data]


Page 31
Another Possible Decision Boundary
[Figure: the same data on FEATURE 1 vs. FEATURE 2, with a different linear decision boundary]


Page 32
Minimum Error Decision Boundary
[Figure: the same data on FEATURE 1 vs. FEATURE 2, with the linear decision boundary that minimizes classification error]


Page 33
The Perceptron Classifier
(pages 729-731 in text)

[Diagram: input attributes (features) are multiplied by their weights and summed together with a bias or threshold term; the sum is passed through a transfer function to produce the output]


Page 34
The Perceptron Classifier
(pages 729-731 in text)
The perceptron classifier is just another name for a linear classifier for 2-class data, i.e.,

output(x) = sign( Σ_j w_j x_j )

Loosely motivated by a simple model of how neurons fire.

For mathematical convenience, class labels are +1 for one class and -1 for the other.

Two major types of algorithms for training perceptrons:
• Objective function = classification accuracy ("error correcting"); see the sketch below
• Objective function = squared error (use gradient descent)

Gradient descent is generally faster and more efficient, but there is a problem: the thresholded output has no gradient!
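As a concrete illustration of the error-correcting option, here is a minimal sketch (Python/NumPy; the toy data, learning rate, and epoch count are assumptions for illustration, not from the lecture) of the classic perceptron update rule, which adjusts the weights only on misclassified examples:

```python
import numpy as np

def train_perceptron(X, y, epochs=20, lr=1.0):
    """Error-correcting perceptron training.

    X: (N, d) inputs, each row ending with a dummy always-1 feature.
    y: (N,) class labels in {-1, +1}.
    """
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for x_i, y_i in zip(X, y):
            if y_i * np.dot(w, x_i) <= 0:   # misclassified (or on the boundary)
                w += lr * y_i * x_i         # nudge the boundary toward x_i
    return w

# Tiny linearly separable toy set (made-up); last column is the dummy feature.
X = np.array([[2.0, 1.0, 1.0], [1.0, 3.0, 1.0],
              [6.0, 5.0, 1.0], [7.0, 7.0, 1.0]])
y = np.array([-1, -1, 1, 1])
w = train_perceptron(X, y)
print(np.sign(X @ w))  # matches y once a separating boundary is found
```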


Page 35
Two different types of perceptron output

The x-axis below is f(x) = f, the weighted sum of inputs; the y-axis is the perceptron output.

[Figure: step function σ(f) vs. f] Thresholded output (step function), takes values +1 or -1.
[Figure: sigmoid o(f) vs. f] Sigmoid output, takes real values between -1 and +1.

The sigmoid is in effect an approximation to the threshold function above, but has a gradient that we can use for learning.

Sigmoid function: σ[f] = 2 / (1 + exp[-f]) - 1
Derivative of sigmoid: ∂σ/∂f = 0.5 * (σ[f] + 1) * (1 - σ[f])
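A small numerical sketch (Python/NumPy; not from the slides) of this sigmoid, with a finite-difference check that the derivative identity above is correct:

```python
import numpy as np

def sigma(f):
    """Sigmoid scaled to (-1, +1): 2 / (1 + exp(-f)) - 1."""
    return 2.0 / (1.0 + np.exp(-f)) - 1.0

def dsigma(f):
    """Derivative via the identity 0.5 * (sigma + 1) * (1 - sigma)."""
    s = sigma(f)
    return 0.5 * (s + 1.0) * (1.0 - s)

f = np.linspace(-4.0, 4.0, 9)
eps = 1e-6
numeric = (sigma(f + eps) - sigma(f - eps)) / (2 * eps)  # central difference
print(np.allclose(numeric, dsigma(f)))                   # True
```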


Page 36
Squared Error for Perceptron with Sigmoidal Output
Squared error = E[w] = Σ_i ( σ(f[x(i)]) - y(i) )²

where x(i) is the ith input vector in the training data, i = 1, …, N
y(i) is the ith target value (-1 or +1)
f[x(i)] = Σ_j w_j x_j(i) is the weighted sum of inputs
σ(f[x(i)]) is the sigmoid of the weighted sum

Note that everything is fixed (once we have the training data) except for the weights w.

So we want to minimize E[w] as a function of w, for example by gradient descent (sketched below).
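To make the minimization concrete, here is a minimal gradient-descent sketch (Python/NumPy; the step size, iteration count, and toy data are illustrative assumptions). The gradient follows from the chain rule applied to E[w], using the sigmoid derivative from the previous slide:

```python
import numpy as np

def sigma(f):
    return 2.0 / (1.0 + np.exp(-f)) - 1.0

def dsigma(f):
    s = sigma(f)
    return 0.5 * (s + 1.0) * (1.0 - s)

def train_by_gradient_descent(X, y, lr=0.1, steps=1000):
    """Minimize E[w] = sum_i (sigma(w . x(i)) - y(i))^2 over w."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        f = X @ w                            # weighted sums, one per example
        err = sigma(f) - y                   # residuals sigma(f[x(i)]) - y(i)
        grad = 2.0 * (err * dsigma(f)) @ X   # dE/dw by the chain rule
        w -= lr * grad                       # step downhill in weight space
    return w

# Same made-up toy data as before; last column is the dummy always-1 feature.
X = np.array([[2.0, 1.0, 1.0], [1.0, 3.0, 1.0],
              [6.0, 5.0, 1.0], [7.0, 7.0, 1.0]])
y = np.array([-1.0, -1.0, 1.0, 1.0])
w = train_by_gradient_descent(X, y)
print(np.sign(X @ w))  # signs should match y after training
```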

