- Bias vs Variance trade-off
- Model estimation time vs
- Explain SVM algorithm. What does it mean to be a high-margin classifier?
- Explain the k-means clustering algorithm
- Do you consider yourself a frequentist or Bayesian?
- How would you cross-validate a time series model?
- Techniques for handling imbalanced data sets?
- L2 vs L1 normalization
- How do you select features to use in a model?
- Features = 50, samples = 20,000. How many trees would you use in a Random Forest implementation? Which splitting criteria?
- What is the Box-Cox transformation used for? It is used to stabilize the variance (reduce heteroskedasticity) and to make a distribution closer to normal. How does it work?
- Fill missing data in this data set
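
For the Box-Cox question above, a minimal sketch of how the transform works may help. It assumes the common one-parameter form, y = (x^λ - 1) / λ for λ ≠ 0 and y = ln(x) for λ = 0, applied to strictly positive data. The function name `box_cox` and the parameter name `lam` are illustrative, not taken from any library.

```python
import math


def box_cox(values, lam):
    """Apply the one-parameter Box-Cox transform to strictly positive values.

    Hypothetical helper for illustration: `lam` is the power parameter
    (usually written as lambda). lam = 0 is defined as the natural log,
    which is the limit of (x**lam - 1) / lam as lam approaches 0.
    """
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1) / lam for v in values]


# lam = 0.5 is a shifted and scaled square root, which compresses
# large values more than small ones.
transformed = box_cox([1.0, 4.0, 9.0], 0.5)
print(transformed)
```

In practice λ is usually not picked by hand but estimated from the data, typically by maximum likelihood; `scipy.stats.boxcox` does this when no λ is supplied.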