|When:||Wednesday Oct. 24, 2012, Noon - 2 PM|
|Place:||The Penn Club, 30 West 44th Street, New York, NY|
|Reservations:|| Call Sam Koslowsky at (212) 502-6717 or email him.
|Cost:||$50 for non-members, $40 for
$5 surcharge for walk-ins without a reservation.
|Topic:||Variable Selection in the Linear Regression Model|
|Abstract:||Data mining applications that build linear models usually involve variable selection, for which the stepwise family of methods is the most widely used, for both linear and logistic regression.
This presentation discusses methods of, and problems in, variable selection for linear regression on present-day Giga-bases. It is intended for practitioners with at least a working knowledge of regression methods.
We introduce the present standard of the stepwise family as well as problems associated with it, such as the issues of redundant and suppressed variables and orthogonalization, and we present an example of the topic. We focus on the issue of wrong coefficient signs and dispel some myths about model interpretation.
Finally, if time permits, we will focus on alternative stopping mechanisms for variable selection that do not depend on statistical inference.
Independent Statistical Consultant
Leonardo Auslender is a statistician (and economist) with more than 25 years of business experience and SAS expertise. His area of expertise is Giga-Data Analysis and Methods, and he has written papers and given lectures on Variable Selection, Missing Value Imputation, Tree Regression, Support Vector Machines, Market-Basket Analysis, Data Base Marketing, CRM, GDP and (Relative Price) Inflation studies, Expectation Formations, and Productivity and Technology effects in the economy. He was a lecturer in Finance and Macroeconomics at Rutgers University.
He has presented two seminars on Market Basket Analysis in New York City (Informs and Amcis) and a two-day seminar on Variable and Feature Selection at the NYC Direct Marketing Association in November 2004. He has also spoken on Collinearity and Variable Selection at the December 2005 SCMA meeting in Auburn, Alabama, and on modeling issues at the SAS M2007 and M2008 Data Mining Conferences and at Informs in NYC.