THE VALUE OF THE CUSTOMER'S WORD - SAS TEXT MINING

Copyright © 2012, SAS Institute Inc. All rights reserved.

Page 1

THE VALUE OF THE CUSTOMER'S WORD - SAS TEXT MINING

rosanamac.lean@sas.com

Page 2

AGENDA

• Unstructured Data
• Current Situation
• Applications
• SAS® Text Analytics

Page 3

UNSTRUCTURED AND SEMISTRUCTURED DATA: THE DARK MATTER FOR IT

Structured data: Relational databases, structured data files, and system/application data and logs that reside in a data store, defined by a catalog (table definitions) or data model and accessible via SQL or object definitions.

This data is contextualized by the heading (field name) and possibly defined in relation to other "fields." It can also be processed in a simple manner: summed, aggregated, etc.

Semistructured data: Houses structure with freeform elements (e.g., e-mails): it has structure and context for specific elements in the header, but is freeform text in the body. Semistructured data comes in many forms.

Semistructured data is also formed when unstructured data is combined with metadata, making it accessible by search engines via indexing schemas. This is the ideal state for naturally unstructured data within organizations.
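A minimal Python sketch of the e-mail case, using only the standard-library `email` package: the header fields come out as named, structured values, while the body stays freeform text that still needs text analytics. The message and addresses are invented for illustration.

```python
from email import message_from_string

# Hypothetical raw e-mail: structured header lines, then a freeform body.
RAW_MESSAGE = """From: customer@example.com
To: support@example.com
Subject: Billing question

Hi, I was charged twice this month and nobody answers the phone.
"""

msg = message_from_string(RAW_MESSAGE)

# Header: each element has a name and a context, like a field in a table.
structured_part = {
    "from": msg["From"],
    "to": msg["To"],
    "subject": msg["Subject"],
}

# Body: plain freeform text with no field names to lean on.
freeform_part = msg.get_payload().strip()

print(structured_part)
print(freeform_part)
```

Only the header can be filtered or aggregated directly; the body is the part that the text mining steps later in the deck are meant to handle.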

Unstructured data: Most of the information that resides in organizations is unstructured in nature: images, content of Web documents, standard documents, audio, video and correspondence.

This type of information is typically difficult to find effectively if nothing has been done to make the data accessible, such as putting it into a content management system and tagging it with metadata.

[Chart labels: 70%, 25%, 5%]

Page 4

CURRENT SITUATION: COMMON QUESTIONS ABOUT TEXTUAL DATA SOURCES

Unable to get the most from text data?

• How can I leverage our textual data sources? What value can they bring?
• Are there hidden insights within text data sources, such as call center notes, emails, news, online forums and social media, that can help my organization?
• How can I leverage both unstructured and structured data sources? Customer data + customer feedback?
• Can I also use text data to analyze and predict the future? To reduce churn, improve sales, reduce costs…

Page 5

WHAT IF YOU COULD…

• Discover new insights from large text data sources
• Extract key patterns from text data to predict the future

Customers
• Discover current topics about your products from customer opinions
• Find patterns within customer feedback that predict strong interest in upsell opportunities

Fraud
• Detect anomalies from usual topics described in text reports, text applications or feedback
• Find patterns in reports that seem to predict or relate to suspicious behavior

Public Opinion
• Understand previously unknown issues/concerns from citizen discussions on Twitter/forums
• Extract key opinions from citizen feedback to forecast citizen sentiment in the near future

Page 6

TEXT ANALYTICS AND NPS

• Know the key drivers of promoters vs. detractors
• Predict the key drivers across the entire customer base
• Measure and visualize the impact of changing the key drivers of promoters vs. detractors

PROMOTERS / NEUTRALS / DETRACTORS
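As a rough illustration of how survey comments could be tied to the three NPS groups, the sketch below segments respondents and counts the terms each group uses. It assumes the usual NPS convention (answers of 9-10 are promoters, 7-8 neutral, 0-6 detractors), which the slide itself does not spell out; the responses and function names are made up.

```python
import re
from collections import Counter

def segment(score):
    """Map a 0-10 survey answer to an NPS group (assumed standard cut-offs)."""
    if score >= 9:
        return "promoter"
    if score >= 7:
        return "neutral"
    return "detractor"

def nps(scores):
    """Net Promoter Score: percentage of promoters minus percentage of detractors."""
    groups = Counter(segment(s) for s in scores)
    return 100 * (groups["promoter"] - groups["detractor"]) / len(scores)

def terms_by_segment(responses):
    """Count the words used in comments, separately for each NPS group."""
    counts = {"promoter": Counter(), "neutral": Counter(), "detractor": Counter()}
    for score, comment in responses:
        counts[segment(score)].update(re.findall(r"[a-z]+", comment.lower()))
    return counts

# Hypothetical (score, comment) pairs.
responses = [
    (10, "fast delivery"),
    (9, "fast support"),
    (3, "slow support"),
    (8, "ok"),
]

print(nps([score for score, _ in responses]))
print(terms_by_segment(responses)["promoter"].most_common(2))
```

Raw word counts are only a starting point; a real driver analysis would compare term frequencies between groups and control for common words.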

Page 7

TEXT ANALYTICS AND CHURN

Page 8

TEXT ANALYTICS AND FRAUD

INSURANCE: CLAIMS FRAUD PREDICTION

                      Predicted: No Fraud    Predicted: Fraud
Actual: No Fraud      Correct Dismissal      False Detection
Actual: Fraud         False Dismissal        Correct Detection

Callouts: 20% improvement; 60% improvement
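The four cells of the matrix above can be tallied from paired actual/predicted labels. The sketch below is one way to do that in Python; the claim labels are invented, and the two rates shown (share of fraud caught, share of legitimate claims flagged) are a common way to read such a matrix, not necessarily the measures behind the slide's improvement figures.

```python
from collections import Counter

# (actual_is_fraud, predicted_is_fraud) -> cell name used on the slide
CELL_NAMES = {
    (False, False): "correct_dismissal",
    (False, True): "false_detection",
    (True, False): "false_dismissal",
    (True, True): "correct_detection",
}

def tally(actual, predicted):
    """Count how many claims fall into each cell of the matrix."""
    cells = Counter({name: 0 for name in CELL_NAMES.values()})
    for a, p in zip(actual, predicted):
        cells[CELL_NAMES[(a, p)]] += 1
    return cells

def detection_rate(cells):
    """Share of truly fraudulent claims that the model flagged."""
    fraud = cells["correct_detection"] + cells["false_dismissal"]
    return cells["correct_detection"] / fraud if fraud else 0.0

def false_detection_rate(cells):
    """Share of legitimate claims that the model flagged by mistake."""
    legit = cells["correct_dismissal"] + cells["false_detection"]
    return cells["false_detection"] / legit if legit else 0.0

# Hypothetical labels for six claims.
actual = [True, True, False, False, False, True]
predicted = [True, False, False, True, False, True]

cells = tally(actual, predicted)
print(dict(cells))
print(detection_rate(cells), false_detection_rate(cells))
```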

Page 9

SENTIMENT ANALYSIS: CATEGORIZE AND DETERMINE SENTIMENT

Page 10

SENTIMENT ANALYSIS: GOVERNMENT

Page 11

SAS® Text Analytics

Information Access & Organization: SAS Enterprise Content Categorization

Predictive Modeling, Discovery of Trends & Patterns: SAS Text Miner, SAS Sentiment Analysis

Subject-Matter Expertise / Analytical Models

Natural Language Processing

Page 12

NATURAL LANGUAGE PROCESSING

• Negation: "No me gusta la nueva gaseosa." (I don't like the new soda.)
• Cause & Effect: "Compré un nuevo teléfono y tengo mejor señal." (I bought a new phone and I have better signal.)
• Disambiguation: "Paris Hilton está en el Hilton de Paris." (Paris Hilton is at the Hilton in Paris.) "La Casa Rosada." vs. "La casa es rosada." (The Casa Rosada, Argentina's presidential palace, vs. "The house is pink.")
• Co-reference: "Alejandro sabe de TM. Él dijo que me ayudará." (Alejandro knows about TM. He said he will help me.)
• Spelling
• Synonyms

Page 13

NATURAL LANGUAGE PROCESSING

• Tokenization: identify words/expressions, or tokens
• Stemming: identify variants: plurals, genders, conjugations
• Entity Extraction: names of people, companies, products, places, email addresses, phone numbers, dates
• Part-of-Speech Tagging: "Yo paseo mi mascota." (I walk my pet.) vs. "El paseo en lancha." (The boat ride.) Same word, "paseo": verb in one sentence, noun in the other
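Two of these steps, tokenization and entity extraction, can be approximated with regular expressions. The Python sketch below is a toy under stated assumptions: the patterns, the day/month/year date format and the sample sentence are illustrative choices, and it does not attempt stemming or part-of-speech tagging, which need language-specific resources.

```python
import re

# Illustrative patterns only; real entity extraction needs far richer rules.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "date": re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"),
}

def tokenize(text):
    """Split text into lowercase word tokens (accented letters included)."""
    return [token.lower() for token in re.findall(r"\w+", text)]

def extract_entities(text):
    """Return every match of each pattern, keyed by entity type."""
    return {label: pattern.findall(text) for label, pattern in PATTERNS.items()}

print(tokenize("Yo paseo mi mascota."))
print(extract_entities("Escriba a ana@example.com antes del 15/03/2012."))
```

Note that this tokenizer returns "paseo" for both example sentences; telling the verb from the noun is exactly what the tagging step adds.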

Page 14

HOW DOES TEXT MINING WORK? EXPLORING & DISCOVERING INSIGHTS

1. Input text messages: e.g. Twitter data, reports, email, news, forum messages
2. Parse & explore text data: break down text and explore relationships of key concepts such as persons, places, organizations…
3. Discover topics: cluster documents of similar content and describe them with important key words
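The parse and cluster steps can be sketched in plain Python. This is not how SAS Text Miner works internally; it is a toy that assumes term-count vectors, cosine similarity and a single greedy pass with an arbitrary threshold of 0.3. The messages and the stop-word list are invented.

```python
import math
import re
from collections import Counter

STOPWORDS = {"the", "a", "and", "i", "was", "on", "my", "is", "me"}

def parse(text):
    """Break a message down into counts of its non-trivial terms."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if w not in STOPWORDS)

def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def cluster(docs, threshold=0.3):
    """Greedy one-pass grouping: join the most similar cluster or start a new one."""
    clusters = []
    for index, text in enumerate(docs):
        vec = parse(text)
        best = max(clusters, key=lambda c: cosine(vec, c["terms"]), default=None)
        if best is not None and cosine(vec, best["terms"]) >= threshold:
            best["members"].append(index)
            best["terms"].update(vec)
        else:
            clusters.append({"members": [index], "terms": Counter(vec)})
    return clusters

# Hypothetical customer messages.
docs = [
    "The battery drains fast and the battery gets hot",
    "Billing error: I was charged twice on my invoice",
    "Battery life is short, battery overheats",
    "Wrong invoice amount, billing charged me twice",
]

for c in cluster(docs):
    keywords = [term for term, _ in c["terms"].most_common(3)]
    print(c["members"], keywords)
```

Each cluster is described by its most frequent terms, which mirrors the "describe them with important key words" part of step 3.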

Page 15

HOW DOES TEXT MINING WORK? DISCOVER PATTERNS FOR PREDICTIVE MODELING

1. Input text messages with relevant structured data: e.g. email, call center notes, applications, together with customer data
2. Parse text data and discover topics: break down text into structured data, group messages of similar content
3. Predictive modeling with text data: text data fed into models may provide reliable information to predict outcomes & behavior, e.g. "This customer is likely to accept your offer…"
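A small Python sketch of the "break down text into structured data" idea: a call center note is reduced to topic flags and merged with the customer's structured fields, giving one row that any classifier could consume. The topics, keywords and field names are hypothetical, and no model is trained here.

```python
import re

# Hypothetical topics, each defined by a handful of keywords.
TOPIC_TERMS = {
    "topic_price": {"price", "expensive", "discount"},
    "topic_service": {"agent", "support", "waiting"},
}

def text_features(note):
    """Turn a freeform note into 0/1 flags, one per topic."""
    tokens = set(re.findall(r"[a-z]+", note.lower()))
    return {topic: int(bool(tokens & terms)) for topic, terms in TOPIC_TERMS.items()}

def build_row(customer, note):
    """Combine structured customer data with text-derived features."""
    row = dict(customer)
    row.update(text_features(note))
    return row

row = build_row(
    {"customer_id": 17, "tenure_months": 30},
    "Asked for a discount, says the plan is too expensive",
)
print(row)
```

Rows built this way would feed step 3: the topic flags sit next to ordinary fields such as tenure and can be used as model inputs like any other column.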

Page 16

SENTIMENT ANALYSIS: HOW DOES IT WORK?

1. Input text messages: e.g. Twitter data, reports, email, news, forum messages
2. Parse messages through a sentiment taxonomy: match and score messages, and their details, for sentiment polarity (e.g. message is 80% positive)
3. Output results: each message/document, and the characteristics within the document, are now associated with a sentiment polarity score (e.g. "This is positive", "This is negative"). Results are indexed or fed into existing systems for search & analysis
4. Sentiment reports: results are easily analyzed against time period and/or product features, with drill-down to the exact message
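Steps 2 and 3 can be imitated with a small word-list scorer. The Python sketch below assumes a hand-made taxonomy (two features with keywords) and tiny positive/negative word lists; SAS Sentiment Analysis uses much richer rules, so treat this only as an illustration of scoring a message and its features for polarity.

```python
import re

# Hypothetical word lists and feature taxonomy.
POSITIVE = {"great", "friendly", "comfortable", "good"}
NEGATIVE = {"rude", "delayed", "lost", "bad"}
FEATURES = {
    "crew": {"crew", "staff"},
    "schedule": {"flight", "departure"},
}

def tokens_of(text):
    return re.findall(r"[a-z]+", text.lower())

def polarity(tokens):
    """Share of sentiment words that are positive, or None if there are none."""
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return pos / (pos + neg) if pos + neg else None

def score_message(text):
    """Score the whole message, then each feature mentioned in its clauses."""
    overall = polarity(tokens_of(text))
    feature_tokens = {}
    for clause in re.split(r"[.,;!?]|\bbut\b|\band\b", text.lower()):
        toks = tokens_of(clause)
        for feature, keywords in FEATURES.items():
            if keywords.intersection(toks):
                feature_tokens.setdefault(feature, []).extend(toks)
    by_feature = {f: polarity(t) for f, t in feature_tokens.items()}
    return overall, by_feature

review = ("Great crew, friendly staff, comfortable seats and good food, "
          "but the flight was delayed.")
overall, by_feature = score_message(review)
print(overall)
print(by_feature)
```

With four positive words and one negative word, the review comes out 80% positive overall, while the per-feature scores show that the negative part concerns the schedule rather than the crew.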

Page 17

TEXT ANALYTICS: EXAMPLE

Data & Sampling / Text Analytics / Model Testing / Model Assessment & Scoring

Page 18

SAS SENTIMENT ANALYSIS

Once the taxonomy for the text documents has been established, rules can be developed to determine the sentiment of the document within that context. These rules can be:

• derived through statistical means,
• taken from out-of-the-box sets of rules,
• written by the user, or
• a hybrid of the above.

Below is an example where a customer review of an airline is found to be of a predominantly positive sentiment.

Page 19

SAS SENTIMENT ANALYSIS

Page 20

SAS® TEXT MINER: COMMONLY USED TEXT MINER NODES

• The Text Parsing node parses a document collection in order to quantify information about the terms that are contained therein.
• The Text Filter node applies filters to reduce the number of terms or documents included in a text analysis. The Text Filter node must be preceded by a Text Parsing node and may be preceded by Text Filter and Text Topic nodes.
• The Text Topic node is used to create topics from a document collection. For each topic it creates, it adds a variable to the training data table which the node exports. The Text Topic node must be preceded by a Text Parsing node and may also be preceded by one or more Text Filter nodes.
• The Text Cluster node performs a statistical cluster analysis of a document collection. The Text Cluster node must be preceded by Text Parsing and Text Filter nodes and may also be preceded by a Text Topic node.
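The "must be preceded by" rules listed above amount to ordering constraints on a process flow. The Python sketch below encodes just those stated requirements and checks a proposed node sequence against them. It is plain Python for illustration, not SAS code or a SAS API, and `validate_flow` is a made-up helper name.

```python
# Nodes that must appear earlier in the flow, as stated on the slide.
REQUIRED_BEFORE = {
    "Text Parsing": set(),
    "Text Filter": {"Text Parsing"},
    "Text Topic": {"Text Parsing"},
    "Text Cluster": {"Text Parsing", "Text Filter"},
}

def validate_flow(nodes):
    """Return a list of ordering problems; an empty list means the flow is valid."""
    problems = []
    seen = set()
    for node in nodes:
        missing = REQUIRED_BEFORE.get(node, set()) - seen
        if missing:
            needed = ", ".join(sorted(missing))
            problems.append(f"{node} needs {needed} earlier in the flow")
        seen.add(node)
    return problems

print(validate_flow(["Text Parsing", "Text Filter", "Text Cluster"]))
print(validate_flow(["Text Parsing", "Text Cluster"]))
```

The optional predecessors ("may also be preceded by") need no check, since their absence never invalidates a flow.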

Page 21

SAS® TEXT MINER: OTHER TEXT MINER NODES

• The Text Import node extracts the text from documents contained in a directory and creates a data set of the results. The Text Import node can also crawl the Internet, beginning from a specified URL, and retrieve the Web pages which it finds.
• The Text Profile node is used to associate descriptive terms with different levels of a target (dependent) variable in the data.
• The Text Rule Builder node creates Boolean rules to predict a categorical target (dependent) variable. Each rule in the set is associated with a specific target category. This node must be preceded by Text Parsing and Text Filter nodes.
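To make "Boolean rules associated with a target category" concrete, the Python sketch below applies a few hand-written rules of the form "all of these terms, none of those terms" to a message. It shows only how such rules could be applied; how the Text Rule Builder node derives its rules is not covered by the slide and is not modeled here. The rules, categories and messages are invented.

```python
import re
from typing import NamedTuple, Optional

class Rule(NamedTuple):
    category: str                      # target category this rule predicts
    all_of: frozenset                  # terms that must all be present
    none_of: frozenset = frozenset()   # terms that must be absent

# Hypothetical rule set, checked in order.
RULES = [
    Rule("billing", frozenset({"invoice"})),
    Rule("billing", frozenset({"charged", "twice"})),
    Rule("network", frozenset({"signal"}), frozenset({"invoice"})),
]

def predict(text, rules=RULES) -> Optional[str]:
    """Return the category of the first rule the text satisfies, else None."""
    terms = set(re.findall(r"[a-z]+", text.lower()))
    for rule in rules:
        if rule.all_of <= terms and not (rule.none_of & terms):
            return rule.category
    return None

print(predict("I was charged twice this month"))
print(predict("No signal at home"))
print(predict("Hello"))
```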

Page 22

SAS TEXT MINER

Page 23

SAS TEXT MINER

1. From thousands of complaint messages…
2. Text Miner breaks down what is mentioned into granular terms/phrases/concepts
3. Text Topics automatically discovers key topics mentioned in the messages and lists key words that seem to describe the topics uniquely
4. We can see the key topics discovered from the complaint messages: which trends/topics are most common or rare

Page 24

SAS TEXT MINER

Page 25

SAS TEXT MINER

Page 26

QUESTIONS?

Page 27

THANK YOU!

rosanamac.lean@sas.com

www.SAS.com