|When:||Wednesday, May 18, 2016 | Noon - 2 PM|
|Place:||The Penn Club
30 West 44th Street
New York, NY
Call (212) 362-0302 or email
|Cost:||$60 for non-members. Member signup fee discounted from cost for first-time attendees.
$50 for chapter members.
|Topic:||Random Forest versus Logistic Regression|
Different methods generate different variable selection results, which reflects the need for a comprehensive literature review. The methods covered include correlation with the dependent variable, p-value, information value, and variable importance, along with their validity under different conditions.
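As an illustration of one of the selection criteria named above, here is a minimal sketch of an information value calculation for a categorical predictor against a binary target. It is not taken from the talk: the function name `information_value` and the additive `smoothing` term (used only to avoid taking the log of zero) are assumptions made for this example.

```python
import math
from collections import defaultdict


def information_value(categories, targets, smoothing=0.5):
    """Information value (IV) of a categorical predictor for a 0/1 target.

    IV = sum over levels of (pct_events - pct_non_events) * WoE,
    where WoE = ln(pct_events / pct_non_events).
    `smoothing` is an assumed additive adjustment that keeps empty
    cells from producing log(0); it is not part of the IV definition.
    """
    events = defaultdict(float)
    non_events = defaultdict(float)
    for level, y in zip(categories, targets):
        if y == 1:
            events[level] += 1
        else:
            non_events[level] += 1

    levels = set(events) | set(non_events)
    total_events = sum(events.values()) + smoothing * len(levels)
    total_non_events = sum(non_events.values()) + smoothing * len(levels)

    iv = 0.0
    for level in levels:
        pct_events = (events[level] + smoothing) / total_events
        pct_non_events = (non_events[level] + smoothing) / total_non_events
        iv += (pct_events - pct_non_events) * math.log(pct_events / pct_non_events)
    return iv


# A predictor unrelated to the target versus one that separates it perfectly.
iv_uninformative = information_value(["a", "a", "b", "b"], [1, 0, 1, 0])
iv_separating = information_value(["a", "a", "b", "b"], [1, 1, 0, 0])
```

A predictor whose levels carry the same share of events and non-events scores zero, while one that separates the classes scores well above it. Ranking variables by this score can disagree with a ranking by p-value or by random forest importance, which is the kind of divergence the abstract points to.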
The default option for Random Forest is biased towards continuous variables and against categorical and binary variables. The unbiased solution is very complex and computationally intensive. There is an opportunity to apply random forest variable importance to generate non-linear relationships, scaling, and interaction terms to improve logistic regressions.
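One commonly cited mechanism for such a bias can be sketched in a few lines, assuming the default importance is based on impurity reduction (the abstract does not say which default is meant). A continuous variable offers many candidate split points, so even pure noise can find a split that looks good by chance, while a binary variable offers only one. The helper names below (`gini`, `best_gini_gain`, `mean_noise_gain`) are hypothetical, and the simulation is an illustration rather than the speaker's analysis.

```python
import random


def gini(pos, n):
    """Gini impurity of a node holding `pos` positives among `n` rows."""
    if n == 0:
        return 0.0
    p = pos / n
    return 2.0 * p * (1.0 - p)


def best_gini_gain(x, y):
    """Largest impurity decrease over all threshold splits of x."""
    n = len(y)
    total_pos = sum(y)
    parent = gini(total_pos, n)
    order = sorted(range(n), key=lambda i: x[i])
    best = 0.0
    left_n = 0
    left_pos = 0
    for k in range(n - 1):
        i = order[k]
        left_n += 1
        left_pos += y[i]
        if x[i] == x[order[k + 1]]:
            continue  # no threshold can separate tied values
        right_n = n - left_n
        right_pos = total_pos - left_pos
        children = (left_n * gini(left_pos, left_n)
                    + right_n * gini(right_pos, right_n)) / n
        best = max(best, parent - children)
    return best


def mean_noise_gain(make_feature, n_rows=100, n_trials=200, seed=0):
    """Average best split gain when the feature is independent of the target."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        y = [rng.randint(0, 1) for _ in range(n_rows)]
        x = [make_feature(rng) for _ in range(n_rows)]
        total += best_gini_gain(x, y)
    return total / n_trials


# Both features are pure noise; only the number of possible splits differs.
continuous_gain = mean_noise_gain(lambda rng: rng.random())
binary_gain = mean_noise_gain(lambda rng: rng.randint(0, 1))
```

Neither feature carries any signal, yet the continuous one achieves a larger average best-split gain, so an importance measure that accumulates such gains would tend to rank it higher.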
Senior Director, Citigroup Global Decision Management