
For a random forest, let [math]p[/math] be the total number of features and [math]m[/math] be the number of features selected at each split.

Determine which of the following statements is/are true.

  • I. When [math]m = p[/math], random forest and bagging are the same procedure.
  • II. [math]\frac{p-m}{p}[/math] is the probability a split will not consider the strongest predictor.
  • III. The typical choice of [math]m[/math] is [math]\frac{p}{2}[/math].

  • (A) None
  • (B) I and II only
  • (C) I and III only
  • (D) II and III only
  • (E) The correct answer is not given by (A), (B), (C), or (D).
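
The expression [math]\frac{p-m}{p}[/math] in the statements above can be checked numerically. The following is a minimal sketch, assuming each split draws [math]m[/math] of the [math]p[/math] features uniformly at random without replacement and that the strongest predictor is feature 0. The function name `prob_excluded` and the values p = 10, m = 3 are illustrative choices, not part of the exercise.

```python
import random

def prob_excluded(p, m, trials=100_000, seed=0):
    """Estimate the chance that a random subset of m out of p features omits feature 0."""
    rng = random.Random(seed)
    misses = 0
    for _ in range(trials):
        # Features considered at one split (assumed: uniform, without replacement).
        candidates = rng.sample(range(p), m)
        if 0 not in candidates:  # feature 0 stands in for the strongest predictor
            misses += 1
    return misses / trials

p, m = 10, 3  # hypothetical values chosen for illustration
estimate = prob_excluded(p, m)
exact = (p - m) / p
print(f"simulated: {estimate:.3f}, (p - m) / p: {exact:.3f}")
```

Under the same assumption, setting m equal to p means every split sees every feature, so the estimated probability of missing any given feature is zero.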

Copyright 2023. The Society of Actuaries, Schaumburg, Illinois. Reproduced with permission.

  • Created by Admin, May 26'23

Determine which of the following statements about random forests is/are true.

  • I. If the number of predictors used at each split is equal to the total number of available predictors, the result is the same as using bagging.
  • II. When building a specific tree, the same subset of predictor variables is used at each split.
  • III. Random forests are an improvement over bagging because the trees are decorrelated.

  • (A) None
  • (B) I and II only
  • (C) I and III only
  • (D) II and III only
  • (E) The correct answer is not given by (A), (B), (C), or (D).
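
How candidate predictors are chosen while a tree is grown can be sketched as follows. This is an illustration only, assuming the standard random forest formulation in which a fresh random sample of m predictors is drawn at every split. The helper `split_candidates` and the values p = 6, m = 2 are hypothetical, and no actual tree is fitted.

```python
import random

def split_candidates(p, m, rng):
    """Return the indices of the m predictors considered at a single split."""
    # Assumption: a new sample is drawn for each split, without replacement.
    return sorted(rng.sample(range(p), m))

rng = random.Random(1)
p = 6  # hypothetical total number of predictors

for m in (2, p):
    # Candidate sets for four consecutive splits of one tree.
    subsets = [split_candidates(p, m, rng) for _ in range(4)]
    print(f"m = {m}: {subsets}")
```

With m smaller than p, the candidate sets generally differ from split to split. With m equal to p, every split considers all predictors, which is the bagging case described in the first statement.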

  • Created by Admin, May 26'23

Determine which of the following statements regarding statistical learning methods is/are true.

  • I. Methods that are highly interpretable are more likely to be highly flexible.
  • II. When inference is the goal, there are clear advantages to using a lasso method versus a bagging method.
  • III. Using a more flexible method will produce a more accurate prediction against unseen data.

  • (A) I only
  • (B) II only
  • (C) III only
  • (D) I, II and III
  • (E) The correct answer is not given by (A), (B), (C), or (D).

  • Created by Admin, May 26'23

You are given a dataset with two variables, which is graphed below. You want to predict y using x.

Determine which statement regarding using a generalized linear model (GLM) or a random forest is true.

  • (A) A random forest is appropriate because the dataset contains only quantitative variables.
  • (B) A random forest is appropriate because the data does not follow a straight line.
  • (C) A GLM is not appropriate because the variance of y given x is not constant.
  • (D) A random forest is appropriate because there is a clear relationship between y and x.
  • (E) A GLM is appropriate because it can accommodate polynomial relationships.

  • Created by Admin, May 26'23

Determine which of the following statements is true.

  • (A) Linear regression is a flexible approach
  • (B) Lasso is more flexible than a linear regression approach
  • (C) Bagging is a low flexibility approach
  • (D) There are methods that have high flexibility and are also easy to interpret
  • (E) None of (A), (B), (C), or (D) are true

  • Created by Admin, Apr 24'24