Description
Banking Insurance Product – Phase 1: IP – F1.H1
Purpose
By responding to this Request for Proposal (RFP), the Proposer agrees that s/he has read and understood all documents within this RFP package.
Submission Details
Responders to this RFP should supply:
- A business report of up to 4 pages (not including cover page, table of contents, or any needed appendix), including any supporting plots and tables.
- The commented code used to produce the results.
The report should address all points described in the “Objective” section below. The report should be returned in the following way:
• Electronic (submit via Moodle)
Background
The Commercial Banking Corporation (hereafter the “Bank”), acting by and through its department of Customer Services and New Products, is seeking proposals for banking services. The Bank ultimately wants to predict which customers will buy a variable rate annuity product.
A variable annuity is a contract between you and an insurance company / bank, under which the insurer agrees to make periodic payments to you, beginning either immediately or at some future date. You purchase a variable annuity contract by making either a single purchase payment or a series of purchase payments.
A variable annuity offers a range of investment options. The value of your investment as a variable annuity owner will vary depending on the performance of the investment options you choose. The investment options for a variable annuity are typically mutual funds that invest in stocks, bonds, money market instruments, or some combination of the three. If you are interested in more information, see: http://www.sec.gov/investor/pubs/varannty.htm
The project will be broken down into 3 phases:
- Phase 1 – Variable Understanding and Assumptions
- Phase 2 – Variable Selection and Model Building
- Phase 3 – Model Assessment and Prediction
Objective – Phase 1
The scope of services in this phase includes the following:
• For this phase use only the training data set.
• Explore the predictor variables individually with the target variable of whether the customer bought the insurance product.
o Summarize only the significant variables in a table ranking from most significant to least significant – the Bank currently uses 𝛼 = 0.002, but is open to another if you defend your reason.
§ This table should separate out the four possible classes of variables – binary, ordinal, nominal, continuous.
§ (HINT: Explore the predictor variables individually for now since you have not yet accounted for missing values.)
§ (HINT: A downside of software is that it sometimes does not display a full p-value, which makes ranking harder. That does not mean you cannot get one through the right commands. As long as the tests have the same degrees of freedom, you can rank on the test statistic as well.)
o In an appendix, include a table with all of the variables ranked by significance.
• Provide a table of odds ratios for only binary predictor variables in relation to the target variable.
o Rank these odds ratios by magnitude.
o Interpret only the highest magnitude odds ratio.
o Report on any interesting findings.
§ (HINT: This is open-ended and has no correct answer. However, you should get used to keeping an eye out for what you might deem important or interesting when exploring data to report in an executive summary.)
• Provide a summary of results around the linearity assumption of continuous variables.
o List both which variables meet and do not meet the needed assumption for continuous variables.
o (HINT: Do not get overly mathematical here. Just report what you find; do not teach.)
• Provide a summary of important data considerations as follows:
o Visual representation of which variables have the highest (defined by you for now) amount of missing values.
o List any combinations of variables that you feel have redundant information so the Bank might consider removing them in the future.
§ (HINT: This is open-ended and has no correct answer. For example, presence of a money market account and money market balance.)
o Report on any interesting findings.
§ (HINT: This is open-ended and has no correct answer. However, you should get used to keeping an eye out for what you might deem important or interesting when exploring data to report in an executive summary. For example, teller visits as well as other variables might represent human contact with the bank as compared to only online contact.)
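As a minimal sketch of the odds-ratio and significance-ranking tasks listed above, and not a required method, the Python example below computes an odds ratio and a Pearson chi-square test for a binary predictor against INS using only the standard library. The function names, the variable names VAR_A and VAR_B, and all cell counts are hypothetical and invented for illustration; they are not taken from the Bank's data. The p-value formula used here holds only for tests with 1 degree of freedom, which is the case for a 2x2 table.

```python
import math
from collections import Counter


def two_by_two(predictor, target):
    """Count the cells of a 2x2 table for a 0/1 predictor and the 0/1 target."""
    counts = Counter(zip(predictor, target))
    a = counts[(1, 1)]  # predictor = 1, bought
    b = counts[(1, 0)]  # predictor = 1, did not buy
    c = counts[(0, 1)]  # predictor = 0, bought
    d = counts[(0, 0)]  # predictor = 0, did not buy
    return a, b, c, d


def odds_ratio(a, b, c, d):
    """Odds of buying when predictor = 1 divided by the odds when predictor = 0.

    Raises ZeroDivisionError if b or c is zero.
    """
    return (a * d) / (b * c)


def chi_square_1df(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) and p-value, 1 df."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, P(chi-square > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical cell counts (a, b, c, d), invented for illustration only.
example_tables = {
    "VAR_A": (30, 70, 10, 90),
    "VAR_B": (25, 75, 20, 80),
}

alpha = 0.002  # the Bank's current significance level

# Every test here has 1 df, so sorting on the statistic matches sorting on the p-value.
ranked = sorted(
    example_tables,
    key=lambda v: chi_square_1df(*example_tables[v])[0],
    reverse=True,
)

for name in ranked:
    stat, p = chi_square_1df(*example_tables[name])
    print(f"{name}: OR={odds_ratio(*example_tables[name]):.3f} "
          f"chi2={stat:.3f} p={p:.3e} significant={p < alpha}")
```

Because math.erfc returns a full floating-point value, the p-values can be ranked directly even when they are very small. For predictors with more than two levels the degrees of freedom differ, so this 1-df shortcut would not apply and a statistics library would be the more natural choice.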
Data Provided
The following two sets of data are provided for the proposal:
- The training data set insurance_t contains 8,495 observations and 48 variables.
o All of these customers have been offered the product in the data set under the variable INS, which takes a value of 1 if they bought and 0 if they did not buy.
o There are 47 variables describing the customer’s attributes before they were offered the new insurance product.
- The validation data set insurance_v contains 2,124 observations and 48 variables.
- The table below gives the model role and description of each variable found in both data sets.
o Except for Branch of Bank, consider anything with more than 10 distinct values as continuous.
Name      Model Role  Description
ACCTAGE   Input       Age of oldest account
DDA       Input       Indicator for checking account
DDABAL    Input       Checking account balance
DEP       Input       Checking deposits
DEPAMT    Input       Total amount deposited
CASHBK    Input       Number of cash back requests
CHECKS    Input       Number of checks written
DIRDEP    Input       Indicator for direct deposit
NSF       Input       Number of insufficient fund issues
NSFAMT    Input       Amount of NSF
PHONE     Input       Number of telephone banking interactions
TELLER    Input       Number of teller visit interactions
SAV       Input       Indicator for savings account
SAVBAL    Input       Savings account balance
ATM       Input       Indicator for ATM interaction
ATMAMT    Input       Total ATM withdrawal amount
POS       Input       Number of point of sale interactions
POSAMT    Input       Total amount for point of sale interactions
CD        Input       Indicator for certificate of deposit account
CDBAL     Input       CD balance
IRA       Input       Indicator for retirement account
IRABAL    Input       IRA balance
LOC       Input       Indicator for line of credit
LOCBAL    Input       LOC balance
INV       Input       Indicator for investment account
INVBAL    Input       INV balance
ILS       Input       Indicator for installment loan
ILSBAL    Input       ILS balance
MM        Input       Indicator for money market account
MMBAL     Input       MM balance
MMCRED    Input       Number of money market credits
MTG       Input       Indicator for mortgage
MTGBAL    Input       MTG balance
CC        Input       Indicator for credit card
CCBAL     Input       CC balance
CCPURC    Input       Number of credit card purchases
SDB       Input       Indicator for safety deposit box
INCOME    Input       Income
HMOWN     Input       Indicator for home ownership
LORES     Input       Length of residence in years
HMVAL     Input       Value of home
AGE       Input       Age
CRSCORE   Input       Credit score
MOVED     Input       Recent address change
INAREA    Input       Indicator for local address
INS       Target      Indicator for purchase of insurance product
BRANCH    Input       Branch of bank
RES       Input       Area classification
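The rule stated above the table (more than 10 distinct values means continuous, except for Branch of Bank) can be sketched in Python as a first pass at sorting variables into classes, together with a simple missing-value share. This is an illustrative sketch only: the function names and all column values are hypothetical, None is assumed to mark a missing value, and labeling BRANCH as nominal is an assumption, since the RFP only states that it should not be treated as continuous.

```python
def classify_variable(name, values):
    """Suggest a variable class from the number of distinct non-missing values."""
    distinct = {v for v in values if v is not None}  # None marks a missing value (assumption)
    if name == "BRANCH":
        return "nominal"  # assumption: the RFP only says BRANCH is not continuous
    if len(distinct) > 10:
        return "continuous"
    if len(distinct) <= 2:
        return "binary"
    return "ordinal or nominal"  # needs a judgment call from the description


def missing_share(values):
    """Fraction of values that are missing (None)."""
    return sum(v is None for v in values) / len(values)


# Hypothetical columns, invented for illustration only.
columns = {
    "DDA": [1, 0, 1, 1, None, 0],
    "ACCTAGE": [0.3, 1.2, 2.5, 3.1, 4.0, 5.7, 6.6, 7.4, 8.9, 9.5, 10.2, None],
    "BRANCH": [f"B{i}" for i in range(1, 13)],
    "RES": ["R", "S", "U", "S", "R", "U"],
}

for name, values in columns.items():
    print(f"{name}: {classify_variable(name, values)}, "
          f"{missing_share(values):.0%} missing")
```

Variables with between 3 and 10 distinct values are deliberately left as "ordinal or nominal", because that split depends on whether the levels have a natural order, which the distinct-value count alone cannot show.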