HW2 Instructions

HOMEWORK 2

INSTRUCTIONS
• Every learner should submit his/her own homework solutions. You are allowed to discuss the
  homework with each other (in fact, I encourage you to form groups and/or use the forums),
  but everyone must submit his/her own solution; you may not copy someone else’s solution.
• The homework will be peer-graded. In analytics modeling, there are often lots of different
  approaches that work well, and I want you to see not just your own, but also others’.
• The homework grading scale reflects the fact that the primary purpose of homework is learning:
Rating   Meaning                                             Point value (out of 100)
4        All correct (perhaps except a few details) with a   100
         deeper solution than expected
3        Most or all correct                                 90
2        Not correct, but a reasonable attempt               75
1        Not correct, insufficient effort                    50
0        Not submitted                                       0

Question 3.1
Using the same data set (credit_card_data.txt or credit_card_data-headers.txt) as
in Question 2.2, use the ksvm or kknn function to find a good classifier:
(a) using cross-validation (do this for the k-nearest-neighbors model; SVM is optional); and
(b) splitting the data into training, validation, and test data sets (pick either KNN or SVM; the other
is optional).
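
As a starting point for parts (a) and (b), the two data-handling schemes can be sketched in base R. This is a minimal sketch under stated assumptions, not a prescribed solution: the helper names (assign_folds, cv_accuracy, three_way_split), the 10 folds, and the 60/20/20 proportions are all hypothetical choices, and the model itself is passed in as a function so that either kknn or ksvm could be plugged in.

```r
# Base-R sketch; the helper names below are hypothetical and are not part
# of the kknn or kernlab packages.

# Randomly assign each of n rows to one of n_folds folds of near-equal size.
assign_folds <- function(n, n_folds = 10) {
  sample(rep(seq_len(n_folds), length.out = n))
}

# k-fold cross-validation: hold out each fold once, train on the rest,
# and average the held-out accuracy. `fit_predict(train, test)` must
# return one predicted label per row of `test`.
cv_accuracy <- function(data, response, fit_predict, n_folds = 10) {
  folds <- assign_folds(nrow(data), n_folds)
  acc <- sapply(seq_len(n_folds), function(f) {
    train <- data[folds != f, , drop = FALSE]
    test  <- data[folds == f, , drop = FALSE]
    mean(fit_predict(train, test) == test[[response]])
  })
  mean(acc)
}

# Shuffle the rows and cut them into training, validation, and test sets.
# The 60/20/20 proportions are an assumption, not a requirement.
three_way_split <- function(data, p_train = 0.6, p_valid = 0.2) {
  n <- nrow(data)
  idx <- sample(n)
  n_train <- floor(p_train * n)
  n_valid <- floor(p_valid * n)
  list(
    train = data[idx[seq_len(n_train)], , drop = FALSE],
    valid = data[idx[n_train + seq_len(n_valid)], , drop = FALSE],
    test  = data[idx[tail(seq_len(n), n - n_train - n_valid)], , drop = FALSE]
  )
}
```

With the kknn package loaded, and assuming the response column has been renamed y and stored as a factor, one possible fit_predict is function(train, test) fitted(kknn(y ~ ., train, test, k = 7, scale = TRUE)). The value k = 7 is only a placeholder. In part (b), candidate models would be compared on the validation set, and the test set would be used once, at the end, to estimate the quality of the chosen model.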
Question 4.1
Describe a situation or problem from your job, everyday life, current events, etc., for which a clustering
model would be appropriate. List some (up to 5) predictors that you might use.
Question 4.2
The iris data set iris.txt contains 150 data points, each with four predictor variables and one
categorical response. The predictors are the width and length of the sepal and petal of flowers and the
response is the type of flower. The data is available from the R library datasets and can be accessed with
iris once the library is loaded. It is also available at the UCI Machine Learning Repository
(https://archive.ics.uci.edu/ml/datasets/Iris). The response values are only given to see how well a
specific method performed and should not be used to build the model.

Use the R function kmeans to cluster the points as well as possible. Report the best combination of
predictors, your suggested value of k, and how well your best clustering predicts flower type.
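
A minimal sketch of the workflow, using only base R and the built-in iris data frame. The predictor subset, the scaling step, centers = 3, and nstart = 20 are placeholder assumptions to be varied, not recommended answers.

```r
# Sketch only: every setting below is a placeholder to experiment with.
set.seed(1)  # kmeans starts from random centers; a seed makes runs repeatable

predictors <- c("Petal.Length", "Petal.Width")   # one candidate subset
scaled <- scale(iris[, predictors])              # common scale for all predictors

# nstart = 20 reruns the algorithm from 20 random starts and keeps the best.
fit <- kmeans(scaled, centers = 3, nstart = 20)

# Species is used only to evaluate the clusters, never to build them.
confusion <- table(cluster = fit$cluster, species = iris$Species)
print(confusion)

# Share of points that fall in the majority species of their cluster.
purity <- sum(apply(confusion, 1, max)) / nrow(iris)

# Total within-cluster sum of squares for k = 1..8, e.g. for an elbow plot.
wss <- sapply(1:8, function(k) kmeans(scaled, centers = k, nstart = 20)$tot.withinss)
```

Repeating this for other predictor subsets and other values of k, and comparing the resulting tables, is one way to arrive at the combination to report.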


