1
The CRISP-DM User Guide
Brussels SIG Meeting
Pete Chapman
NCR Systems Engineering Copenhagen
email: Pete.Chapman@Copenhagen.NCR.com

2
Agenda
■CRISP-DM Objectives and Benefits
■CRISP-DM Deliverables
■CRISP-DM Methodology, Phases and Tasks
■CRISP-DM User Guide
■Possible CRISP-DM Futures

3
Objectives and Benefits of CRISP-DM
◆ensure quality of knowledge discovery project results
◆reduce skills required for knowledge discovery
◆reduce costs and time
◆general purpose (i.e., stable across varying applications)
◆robust (i.e., insensitive to changes in the environment)
◆tool and technique independent
◆tool supportable
◆support documentation of projects
◆capture experience for reuse
◆support knowledge transfer and training

4
CRISP-DM Deliverables
◆Process Model
◆Methodology
◆Reference Model
◆User Guide
◆Output (Deliverable/Templates)
◆Tool Support
◆Tool Support Definitions
◆Stream Library
◆Experimentation
◆Experimentation Reports
◆CRISP-DM SIG User Feedback

5
CRISP-DM Methodology
[Figure: four-level breakdown of the CRISP-DM methodology]
CRISP Process Model: Phases, Generic Tasks
Mapping
CRISP Process: Specialized Tasks, Process Instances

6
Data Mining Contexts
Specialized Tasks
Generic Tasks
Application Domains
• Response Modeling
• Churn Prediction
•...
Technical Aspects
• Missing Values
• Outliers
•...
Problem Types
• Data Description / Summarization
• Segmentation
• Concept Description
• Predictive Modeling
• Dependency Analysis
Tools and Techniques
• Clementine
• MineSet
• Decision Trees
•...
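As a rough illustration of how a context turns a generic task into a specialized one, the sketch below models the four context dimensions from this slide as a small record. The class name `DataMiningContext`, the function `specialize` and the wording of the generated description are hypothetical and not part of CRISP-DM; the example values are taken from the bullets above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataMiningContext:
    """One value per context dimension named on the slide (hypothetical model)."""
    application_domain: str
    problem_type: str
    technical_aspect: str
    tool_or_technique: str


def specialize(generic_task: str, context: DataMiningContext) -> str:
    """Describe a generic task in terms of one concrete context."""
    return (
        f"{generic_task} for {context.application_domain} "
        f"({context.problem_type}), handling {context.technical_aspect}, "
        f"using {context.tool_or_technique}"
    )


# Example context assembled from the slide's own bullet values.
churn = DataMiningContext(
    application_domain="Churn Prediction",
    problem_type="Predictive Modeling",
    technical_aspect="Missing Values",
    tool_or_technique="Decision Trees",
)
specialized = specialize("Clean Data", churn)
print(specialized)
```

A frozen dataclass is used so that a context can be shared between tasks without being modified by accident.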

7
CRISP-DM Phases
[Figure: CRISP-DM life cycle with the phases Business Understanding, Data Understanding, Data Preparation, Modelling, Evaluation and Deployment arranged around the Data]
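The six phases can be sketched as an enumeration with the moves allowed between them. The arrows of the diagram did not survive in this text, so the transitions below are an assumption based on the published CRISP-DM life-cycle figure; the names `Phase`, `NEXT_PHASES` and `can_move` are hypothetical.

```python
from enum import Enum


class Phase(Enum):
    BUSINESS_UNDERSTANDING = "Business Understanding"
    DATA_UNDERSTANDING = "Data Understanding"
    DATA_PREPARATION = "Data Preparation"
    MODELLING = "Modelling"
    EVALUATION = "Evaluation"
    DEPLOYMENT = "Deployment"


# Assumed transitions (not stated in the slide text): the cycle moves forward,
# with feedback loops between neighbouring phases and from Evaluation back to
# Business Understanding.
NEXT_PHASES = {
    Phase.BUSINESS_UNDERSTANDING: {Phase.DATA_UNDERSTANDING},
    Phase.DATA_UNDERSTANDING: {Phase.BUSINESS_UNDERSTANDING, Phase.DATA_PREPARATION},
    Phase.DATA_PREPARATION: {Phase.MODELLING},
    Phase.MODELLING: {Phase.DATA_PREPARATION, Phase.EVALUATION},
    Phase.EVALUATION: {Phase.BUSINESS_UNDERSTANDING, Phase.DEPLOYMENT},
    Phase.DEPLOYMENT: set(),
}


def can_move(current: Phase, target: Phase) -> bool:
    """Return True if the assumed life cycle allows moving from current to target."""
    return target in NEXT_PHASES[current]


print(can_move(Phase.MODELLING, Phase.DATA_PREPARATION))
print(can_move(Phase.DEPLOYMENT, Phase.EVALUATION))
```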

8
Phases and Tasks
Generic tasks per phase, each followed by its outputs:

Business Understanding
◆Determine Business Objectives: Background; Business Objectives; Business Success Criteria
◆Situation Assessment: Inventory of Resources; Requirements, Assumptions, and Constraints; Risks and Contingencies; Terminology; Costs and Benefits
◆Determine Data Mining Goal: Data Mining Goals; Data Mining Success Criteria
◆Produce Project Plan: Project Plan; Initial Assessment of Tools and Techniques

Data Understanding
◆Collect Initial Data: Initial Data Collection Report
◆Describe Data: Data Description Report
◆Explore Data: Data Exploration Report
◆Verify Data Quality: Data Quality Report

Data Preparation
◆Phase outputs: Data Set; Data Set Description
◆Select Data: Rationale for Inclusion / Exclusion
◆Clean Data: Data Cleaning Report
◆Construct Data: Derived Attributes; Generated Records
◆Integrate Data: Merged Data
◆Format Data: Reformatted Data

Modeling
◆Select Modeling Technique: Modeling Technique; Modeling Assumptions
◆Generate Test Design: Test Design
◆Build Model: Parameter Settings; Models; Model Description
◆Assess Model: Model Assessment; Revised Parameter Settings

Evaluation
◆Evaluate Results: Assessment of Data Mining Results w.r.t. Business Success Criteria; Approved Models
◆Review Process: Review of Process
◆Determine Next Steps: List of Possible Actions; Decision

Deployment
◆Plan Deployment: Deployment Plan
◆Plan Monitoring and Maintenance: Monitoring and Maintenance Plan
◆Produce Final Report: Final Report; Final Presentation
◆Review Project: Experience Documentation

9
Introduction to the User Guide
Generic Tasks
Specialized Tasks
Context
Reference Model
What To Do?
User Guide
How To Do?
• check lists
• questionnaires
• tools
• sequences of steps
• decision points
• pitfalls

10
CRISP-DM User Guide

11
How to use the User Guide (i)
◆Contents of the User Guide
  - More detailed description of the various tasks using:
    ◆Activities List
    ◆Check Lists
    ◆Good Ideas
    ◆Warnings!
◆What is NOT in the User Guide
  ◆Deliverables/Document Templates (as yet)
  ◆Description of Techniques and Tools (as yet)
  ◆Estimates of engagements
  ◆Quality Indicators

12
How to use the User Guide (ii)
◆Beginning Data Miners
  ◆What tasks do I need to do?
  ◆What is the order of the tasks in a Data Mining Engagement?
  ◆What risks do I run?
  ◆Are there any “shortcuts” in my tasks?
  ◆What is the format of the deliverables that I need to present to management?
◆Experienced Data Miners
  ◆Have I missed any activity?
  ◆Are there any tasks or activities that I can leave until later?
  ◆How can I make a Project Plan?
  ◆How can I document the project for later re-use?

13
Possible Future CRISP-DM Deliverables
◆“CRISP-DM - The Book”, includes
  ◆Experiences, feedback from SIG members
  ◆Reference Model, User Guide updated with experiments
  ◆Full Deliverables/Document Templates
  ◆Case Studies
  ◆Mapping Advice from Generic to Specific Engagements
  ◆More explicit advice on Tools & Techniques
  ◆Advice on documentation of engagements, establishment of Data Mining Library, ...

14