JAEM'2007: Aprendizaje Supervisado de Reglas Difusas mediante un Sistema Clasificador Evolutivo...

33
Aprendizaje Supervisado de Reglas Aprendizaje Supervisado de Reglas Difusas mediante un Sistema Clasificador Evolutivo Estilo Michigan Albert Orriols-Puig 1,2 Albert Orriols Puig Jorge Casillas 2 Ester Bernadó-Mansilla 1 1 Grup de Recerca en Sistemes Intel·ligents Enginyeria i Arquitectura La Salle, Universitat Ramon Llull 2 Dept. de Ciencias de la Computación e IA Universidad de Granada

description

 

Transcript of JAEM'2007: Aprendizaje Supervisado de Reglas Difusas mediante un Sistema Clasificador Evolutivo...

Aprendizaje Supervisado de Reglas Aprendizaje Supervisado de Reglas Difusas mediante un Sistema

Clasificador Evolutivo Estilo Michigan

Albert Orriols-Puig1,2Albert Orriols PuigJorge Casillas2

Ester Bernadó-Mansilla1

1Grup de Recerca en Sistemes Intel·ligentsEnginyeria i Arquitectura La Salle, Universitat Ramon Llull

2Dept. de Ciencias de la Computación e IAUniversidad de Granada

Motivation

Michigan-style LCSs for supervised learning. Eg. UCS

Evolve online highly accurate models– Evolve online highly accurate models

– Competitive to the most-used machine learning techniques• (Bernadó et al, 03; Wilson, 02; Bacardit & Butz, 04; Butz, 06; Orriols & Bernadó, 07)

Main weakness: Intepretability of the rule sets– Continuous attributes represented with intervals: [li, ui] . Semantic-p [ i, i]

free variables

– Number of rules or classifiers• Reduction schemes (Wilson, 02; Fu & Davis, 02; Dixon et al., 03, Orriols & Bernadó, 2005)

Slide 2GRSI Enginyeria i Arquitectura la Salle

(Wilson, 02; Fu & Davis, 02; Dixon et al., 03, Orriols & Bernadó, 2005)

Motivation

Jorge’s Proposal:Let’s “fuzzify” UCS– Let’s fuzzify” UCS

• Change the rule representation to fuzzy rules

Framework on Michigan-style Learning Fuzzy-Classifier Systems (LFCS)

– (Valenzuela-Radón, 91 & 98)

– (Parodi & Bonelli, 93)

– (Furuhashi, Nakaoka & Uchikawa, 94)

– (Velasco, 98)

– (Ishibuchi, Nakashima & Murata, 99 & 05): First LFCS for pattern classification

– (Casillas, Carse & Bull, 07) Fuzzy-XCS

Slide 3GRSI Enginyeria i Arquitectura la Salle

Aim

Propose Fuzzy-UCSAccuracy based Michigan style LFCS– Accuracy-based Michigan-style LFCS

– Supervised learning scheme

– Derived from UCS (Bernadó & Garrell, 2003)

• Introduction of a linguistic fuzzy representation

• Modification of all operators that deal with rules

– We expect:We expect:• Achieve similar performance than UCS

• Higher interpretability• Higher interpretability

– Plus new opportunities:

Slide 4GRSI Enginyeria i Arquitectura la Salle

• Mine in uncertain environments

Outline

1 D i i f F UCS1. Description of Fuzzy-UCS

2 Experimental Methodology2. Experimental Methodology

3. Results3. Results

4. Conclusions

Slide 5GRSI Enginyeria i Arquitectura la Salle

Description of UCS1. Description of Fuzzy-UCS2. Experimental Methodology3. Results4 C l i

p4. Conclusions

Michigan-style LCS’s (Holland, 1975):– Derived from XCS (Wilson 1995) a reinforcement learning– Derived from XCS (Wilson, 1995), a reinforcement learning

method.

– Designed specifically for supervised learningDesigned specifically for supervised learning

Rule representation:C ti i bl t d i t l [l ]– Continuous variables represented as intervals: [li, ui]

– Eg:IF x1 Є [l1, u1] ^ x2 Є [l2, u2] … ^ xn Є[ln, nn] THEN class1

– Matching instance e: for all ei: li ≤ ei ≤ ui

IF x1 Є [l1, u1] x2 Є [l2, u2] … xn Є[ln, nn] THEN class1

– Set of parameters: Accuracy, Fitness, Numerosity, Experience, Correct set size

Slide 6GRSI Enginyeria i Arquitectura la Salle

Description of UCS1. Description of Fuzzy-UCS2. Experimental Methodology3. Results4 C l i

p4. Conclusions

Environment

M t h S t [M]P bl i t

Stream ofexamples

Population [P]

1 C A acc F num cs ts exp3 C A acc F num cs ts exp5 C A acc F num cs ts exp

Match Set [M]Problem instance+

output class

1 C A acc F num cs ts exp2 C A acc F num cs ts exp3 C A acc F num cs ts exp4 C A acc F num cs ts exp

Population [P]

Classifier

6 C A acc F num cs ts exp…

correct set4 C A acc F num cs ts exp5 C A acc F num cs ts exp6 C A acc F num cs ts exp

ClassifierParameters

UpdateMatch set generation

Correct Set [C]

correct setgeneration

ExperienceCorrectacc #

=Selection, Reproduction, mutation

Deletion 3 C A acc F num cs ts exp6 C A acc F num cs ts exp

Correct Set [C]

p

νaccFitness =Genetic AlgorithmIf there are no classfiers in [C], covering is triggered

Slide 7GRSI Enginyeria i Arquitectura la Salle

Description of Fuzzy-UCSp y

Describe the different components1. Rule representation and matching1. Rule representation and matching

2. Learning interaction

3 Di t3. Discovery component

4. Fuzzy-UCS in test mode

Slide 8GRSI Enginyeria i Arquitectura la Salle

Description of Fuzzy-UCS1. Description of Fuzzy-UCS2. Experimental Methodology3. Results4 C l i

p y

Rule representation

4. Conclusions

Rule representation– Linguistic fuzzy rules

– E.g.: IF x1 is A1 and x2 is A2 … and xn is An THEN class1

Disjunction of linguistic

All i bl h th ti

Disjunction of linguistic fuzzy terms

– All variables share the same semantics

– Example: Ai = {small, medium, large}

IF x1 is small and x2 is medium or large THEN class1

– Codification:

Slide 9GRSI Enginyeria i Arquitectura la Salle

IF [100 | 011] THEN class1

Description of Fuzzy-UCS1. Description of Fuzzy-UCS2. Experimental Methodology3. Results4 C l i

p y

How do we know if a given input is small, medium or large?

4. Conclusions

g p , g– Each linguistic term defined by a membership function

Belongs to medium with a degree of 0 8Belongs to medium with a degree of 0.8

Belongs to small with a degree of 0 2

ei

Belongs to small with a degree of 0.2

Triangular-shaped membership functions

Attribute valuemembership functions

Slide 10GRSI Enginyeria i Arquitectura la Salle

Description of Fuzzy-UCS1. Description of Fuzzy-UCS2. Experimental Methodology3. Results4 C l i

p y

Matching degree uAk(e) [0,1]

4. Conclusions

g g A ( ) [ , ]

k: IF x1 is small and x2 is medium or large THEN class1

Example: (e1, e2)

0 8

0.2 0.2

0.8

T-conorm: bounded sum

e1 e2

max ( 1, 0.8 + 0.2) = 1

T-norm: productk

Slide 11GRSI Enginyeria i Arquitectura la Salle

uAk(e) = 1 * 0.2 = 0.2

Description of Fuzzy-UCS1. Description of Fuzzy-UCS2. Experimental Methodology3. Results4 C l i

p y4. Conclusions

The role of matching changes:• UCS: A rule matches or not an example (binary function)• Fuzzy-UCS: A rule matches an example with a certain degree

Slide 12GRSI Enginyeria i Arquitectura la Salle

Description of Fuzzy-UCS1. Description of Fuzzy-UCS2. Experimental Methodology3. Results4 C l i

p y4. Conclusions

Each classifier has the following parameters:1 Weight per class w :1. Weight per class wj:

• Soundness with which the rule predicts the class j.

• The class value is dynamic and corresponds to the class j with higher w• The class value is dynamic and corresponds to the class j with higher wj

2. Fitness:• Quality of the rule

3. Other parameters directly inherited from UCS:• numerosity

• Experience

Slide 13GRSI Enginyeria i Arquitectura la Salle

Description of Fuzzy-UCSp y

Describe the different components1. Rule representation and matching1. Rule representation and matching

2. Learning interaction

3 Di t3. Discovery component

4. Fuzzy-UCS in test mode

Slide 14GRSI Enginyeria i Arquitectura la Salle

Description of Fuzzy-UCS1. Description of Fuzzy-UCS2. Experimental Methodology3. Results4 C l i

p y4. Conclusions

Learning interaction:– The environment provides an example e and its class c

– Match set creation: all classifiers that match with uAk(x) > 0

– Correct set creation: all classifiers that advocate cCorrect set creation: all classifiers that advocate c

– Covering: if there is not a classifier that maximally matches e• Create the classifier that match the input example with maximumCreate the classifier that match the input example with maximum

degree.

• Generalize the condition with probability P#

A2A1 A3For each variable:

Slide 15GRSI Enginyeria i Arquitectura la Salle

Description of Fuzzy-UCS1. Description of Fuzzy-UCS2. Experimental Methodology3. Results4 C l i

p y4. Conclusions

Parameters’ UpdateExperience:– Experience:

– Sum of correct matching per class j cmj:

Slide 16GRSI Enginyeria i Arquitectura la Salle

Description of Fuzzy-UCS1. Description of Fuzzy-UCS2. Experimental Methodology3. Results4 C l i

p y4. Conclusions

Parameters’ UpdateUse cm to update of the weights per each class:– Use cm to update of the weights per each class:

• Rule that only matches instances of class c:

• wc = 1

• For all the other classes j: wj = 0

• Rule that matches instances of all classes:

– Calculate the fitness

u e t at atc es sta ces o a c asses

• All weights wi ranging [0, 1]

Pressuring toward rules that correctly match instances of

only one classonly one class

Slide 17GRSI Enginyeria i Arquitectura la Salle

Description of Fuzzy-UCSp y

Describe the different components1. Rule representation and matching1. Rule representation and matching

2. Learning interaction

3 Di t3. Discovery component

4. Fuzzy-UCS in test mode

Slide 18GRSI Enginyeria i Arquitectura la Salle

Description of Fuzzy-UCS1. Description of Fuzzy-UCS2. Experimental Methodology3. Results4 C l i

p y4. Conclusions

Discovery componentSteady state niched GA– Steady-state niched GA

– Roulette wheel selectionInstances that have a highergmatching degree have more

opportunities of being selected

Slide 19GRSI Enginyeria i Arquitectura la Salle

Description of Fuzzy-UCS1. Description of Fuzzy-UCS2. Experimental Methodology3. Results4 C l i

p y4. Conclusions

Discovery componentCrossover and mutation applied on the antecedent– Crossover and mutation applied on the antecedent

• 2 point crossoverIF [100 | 011] THEN classIF [100 | 011] THEN class1IF [101 | 100] THEN class1

• Mutation:

– Expansion IF [100 | 011] THEN class1 IF [101 | 011] THEN class1p

– Contraction

[ | ] 1 [ | ] 1

IF [100 | 011] THEN class1 IF [100 | 001] THEN class1

– Shift IF [100 | 011] THEN class1 IF [010 | 011] THEN class1

Slide 20GRSI Enginyeria i Arquitectura la Salle

Description of Fuzzy-UCSp y

Describe the different components1. Rule representation and matching1. Rule representation and matching

2. Learning interaction

3 Di t3. Discovery component

4. Fuzzy-UCS in test mode

Slide 21GRSI Enginyeria i Arquitectura la Salle

Description of Fuzzy-UCS1. Description of Fuzzy-UCS2. Experimental Methodology3. Results4 C l i

p y4. Conclusions

Class inference of a test example e– Combining the information of all rules yields better results than

taking a single rule for reasoning (Cordon et al. 1998)

• Inference:

– All experienced rules vote for the class they predict as: uAk(e) · Fk

– The most voted class is returned.

Slide 22GRSI Enginyeria i Arquitectura la Salle

Outline

1 D i i f F UCS1. Description of Fuzzy-UCS

2 Experimental Methodology2. Experimental Methodology

3. Results3. Results

4. Conclusions

Slide 23GRSI Enginyeria i Arquitectura la Salle

Experimental Methodology1. Description of Fuzzy-UCS2. Experimental Methodology3. Results4 C l i

p gy4. Conclusions

Evaluating Fuzzy-UCS’ performance– Compare Fuzzy-UCS’ accuracy to:– Compare Fuzzy-UCS accuracy to:

• Three non-fuzzy learners: UCS, SMO, and C4.5

• Two fuzzy learners: Fuzzy LogitBoost and Fuzzy GP• Two fuzzy learners: Fuzzy LogitBoost and Fuzzy GP

– Default configuration for all methods

F UCS fi ti– Fuzzy-UCS configuration:iter = 100,000, N = 6400, F0 = 0.99, v=10, {θGA, θdel, θsub} = 50,x =0.8, u=0.04, P#=0.6x 0.8, u 0.04, P# 0.6

– Fuzzy learners: 5 linguistic labels per variable

10 fold cross validation– 10-fold cross-validation

– Averages over 10 runs

Slide 24GRSI Enginyeria i Arquitectura la Salle

Experimental Methodology1. Description of Fuzzy-UCS2. Experimental Methodology3. Results4 C l i

p gy4. Conclusions

Data domains

#Inst #Fea #Re #In #No #Cl %Min %Max %MisAtt

annealing 898 38 6 0 32 5 0.9 76.2 0

balance 625 4 4 0 0 3 7.8 46.1 0

bupa 345 6 6 0 0 2 42 58 0

glass 214 9 9 0 0 6 4 2 35 5 oglass 214 9 9 0 0 6 4.2 35.5 o

heart-c 303 13 6 0 7 2 45,5 54.5 15,4

heart-s 270 13 13 0 0 2 44.4 56.6 0

iris 150 4 4 0 0 3 33.3 33.3 0

wbcd 699 9 0 9 0 2 34.5 65.5 11,1

wine 178 13 13 0 0 3 27 39 9 0wine 178 13 13 0 0 3 27 39.9 0

zoo 101 17 0 1 16 7 4 40.6 0

Slide 25GRSI Enginyeria i Arquitectura la Salle

Outline

1 D i i f F UCS1. Description of Fuzzy-UCS

2 Experimental Methodology2. Experimental Methodology

3. Results3. Results

4. Conclusions

Slide 26GRSI Enginyeria i Arquitectura la Salle

Results1. Description of Fuzzy-UCS2. Experimental Methodology3. Results4 C l i4. Conclusions

• 1st objective: Competitive in terms of performance

Slide 27GRSI Enginyeria i Arquitectura la Salle

Results1. Description of Fuzzy-UCS2. Experimental Methodology3. Results4 C l i4. Conclusions

• 2nd objective: Improve the interpretability

Example of rules evolved by UCS for iris

• 2nd objective: Improve the interpretability

Example of rules evolved by UCS for iris

Example of rules evolved by Fuzzy-UCS for iris – Linguistic terms: {XS, S, M, L, XL}

Slide 28GRSI Enginyeria i Arquitectura la Salle

Further work1. Description of Fuzzy-UCS2. Experimental Methodology3. Results4 C l i4. Conclusions

Still large rule-sets!

Fuzzy UCS UCSFuzzy-UCS UCS

annealing 2769 4494b l 1212 2177balance 1212 2177bupa 1440 2961glass 2799 3359glass 2799 3359heart-c 3574 2977heart-s 2415 3735iris 480 1039wbcd 3130 2334wine 3686 3685zoo 773 1291

Slide 29GRSI Enginyeria i Arquitectura la Salle

Solution: New inference schemes

Further work1. Description of Fuzzy-UCS2. Experimental Methodology3. Results4 C l i4. Conclusions

Still large rule-sets!

Fuzzy-UCS Fuzzy UCS UCSybest rule Fuzzy-UCS UCS

annealing 36 2769 4494b l 75 1212 2177balance 75 1212 2177bupa 39 1440 2961glass 36 2799 3359glass 36 2799 3359heart-c 46 3574 2977heart-s 62 2415 3735iris 7 480 1039wbcd 28 3130 2334wine 26 3686 3685zoo 10 773 1291

Slide 30GRSI Enginyeria i Arquitectura la Salle

Solution: New inference schemes

Outline

1 D i i f F UCS1. Description of Fuzzy-UCS

2 Experimental Methodology2. Experimental Methodology

3. Results3. Results

4. Conclusions

Slide 31GRSI Enginyeria i Arquitectura la Salle

Conclusions and Further Work1. Description of Fuzzy-UCS2. Experimental Methodology3. Results4 C l i4. Conclusions

Conclusions– We proposed a Michigan-style LFCS for supervised learning– Competitive with respect to:

• Some of the most-used machine learners: UCS, SMO, and C4.5• Recent proposals of Fuzzy-learners: Fuzzy LogitBoost and Fuzzy GP

– Improvement in terms of interpretability with respect to UCS

Further workEvolve more reduced populations– Evolve more reduced populations

– Enhance the comparison with new real-world problems– Compare to other LFCS– Compare to other LFCS– Exploit the incremental learning approach to dig large datasets

Slide 32GRSI Enginyeria i Arquitectura la Salle

Aprendizaje Supervisado de Reglas Aprendizaje Supervisado de Reglas Difusas mediante un Sistema

Clasificador Evolutivo Estilo Michigan

Albert Orriols-Puig1,2Albert Orriols PuigJorge Casillas2

Ester Bernadó-Mansilla1

1Grup de Recerca en Sistemes Intel·ligentsEnginyeria i Arquitectura La Salle, Universitat Ramon Llull

2Dept. de Ciencias de la Computación e IAUniversidad of Granada