You are performing K-means clustering on a set of data. Each data point has been randomly assigned to one of 3 clusters as follows:

Cluster   Data Point
A         (2, -1)
A         (-1, 2)
A         (-2, 1)
A         (1, 2)
B         (4, 0)
B         (4, -1)
B         (0, -2)
B         (0, -5)
C         (-1, 0)
C         (3, 8)
C         (-2, 0)
C         (0, 0)

A single iteration of the algorithm is performed using the Euclidean distance between points, and the cluster containing the fewest data points is then identified.

Calculate the number of data points in this cluster.

  • 0
  • 1
  • 2
  • 3
  • 4

Copyright 2023. The Society of Actuaries, Schaumburg, Illinois. Reproduced with permission.

  • Created by Admin, May 25'23
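
The single iteration described in the exercise above can be sketched in Python. This is a minimal illustration, not an official solution: it assumes that one iteration means computing each cluster's centroid from the initial assignments and then reassigning every point to its nearest centroid. The helper names `centroids` and `one_iteration` are made up for this sketch.

```python
from math import dist

# Initial random assignments from the exercise: (cluster label, point).
assignments = [
    ("A", (2, -1)), ("A", (-1, 2)), ("A", (-2, 1)), ("A", (1, 2)),
    ("B", (4, 0)), ("B", (4, -1)), ("B", (0, -2)), ("B", (0, -5)),
    ("C", (-1, 0)), ("C", (3, 8)), ("C", (-2, 0)), ("C", (0, 0)),
]

def centroids(labelled):
    """Return the coordinate-wise mean of the points in each cluster."""
    groups = {}
    for label, point in labelled:
        groups.setdefault(label, []).append(point)
    return {
        label: tuple(sum(coord) / len(pts) for coord in zip(*pts))
        for label, pts in groups.items()
    }

def one_iteration(labelled):
    """Compute centroids, then reassign each point to the nearest centroid.

    Assumption: nearness is measured by Euclidean distance, as the exercise states.
    """
    cents = centroids(labelled)
    return [
        (min(cents, key=lambda lab: dist(point, cents[lab])), point)
        for _, point in labelled
    ]

new_assignments = one_iteration(assignments)

# Count points per cluster, starting from zero so an emptied cluster still appears.
sizes = {label: 0 for label, _ in assignments}
for label, _ in new_assignments:
    sizes[label] += 1

smallest = min(sizes.values())
print(sizes, smallest)
```

Running the sketch prints the size of each cluster after reassignment and the size of the smallest one, which is the quantity the exercise asks for.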

You apply 2-means clustering to a set of five observations with two features. You are given the following initial cluster assignments:

Observation   X1   X2   Initial cluster
1             1    3    1
2             0    4    1
3             6    2    1
4             5    2    2
5             1    6    2


Calculate the total within-cluster variation of the initial cluster assignments, based on the Euclidean distance measure.

  • 32.0
  • 70.3
  • 77.3
  • 118.3
  • 141.0

Copyright 2023. The Society of Actuaries, Schaumburg, Illinois. Reproduced with permission.
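
The calculation in the exercise above can be sketched as follows. The exercise does not spell out its definition of within-cluster variation, so this sketch assumes the pairwise form used in textbooks such as An Introduction to Statistical Learning: for each cluster, the sum of squared Euclidean distances over all ordered pairs of observations, divided by the cluster size. Under that assumption the result equals twice the sum of squared deviations from each cluster's centroid. The function names are hypothetical.

```python
from itertools import combinations

# Observations and initial cluster labels from the exercise.
observations = {1: (1, 3), 2: (0, 4), 3: (6, 2), 4: (5, 2), 5: (1, 6)}
initial_cluster = {1: 1, 2: 1, 3: 1, 4: 2, 5: 2}

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def within_cluster_variation(points):
    """Assumed definition: (1/|C|) * sum of squared distances over ordered pairs."""
    # Each unordered pair appears twice in the ordered-pair sum, hence the factor 2.
    pair_sum = sum(sq_dist(p, q) for p, q in combinations(points, 2))
    return 2 * pair_sum / len(points)

def total_within_cluster_variation(obs, labels):
    """Sum the within-cluster variation over all clusters."""
    clusters = {}
    for i, k in labels.items():
        clusters.setdefault(k, []).append(obs[i])
    return sum(within_cluster_variation(pts) for pts in clusters.values())

total = total_within_cluster_variation(observations, initial_cluster)
print(round(total, 1))
```

If a course uses a different convention (for example, the plain sum of squared deviations from the centroid without the factor of 2), the result changes accordingly, so the definition should be checked against the relevant syllabus text.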


Determine which of the following statements about selecting the optimal number of clusters in K-means clustering is/are true.

  I. K should be set equal to n, the number of observations.
  II. Choose K such that the total within-cluster variation is minimized.
  III. The determination of K is subjective and there does not exist one method to determine the optimal number of clusters.

  • (A) I only
  • (B) II only
  • (C) III only
  • (D) I, II and III
  • (E) The correct answer is not given by (A), (B), (C), or (D).

Copyright 2023. The Society of Actuaries, Schaumburg, Illinois. Reproduced with permission.
