Exercises

You are performing K-means clustering on a set of data. The data points have been randomly assigned to 3 clusters as follows:

Cluster	Data Point
A	(2, –1)
A	(–1, 2)
A	(–2, 1)
A	(1, 2)
B	(4, 0)
B	(4, –1)
B	(0, –2)
B	(0, –5)
C	(–1, 0)
C	(3, 8)
C	(–2, 0)
C	(0, 0)

A single iteration of the algorithm is performed using the Euclidean distance between points. The cluster containing the fewest data points after this iteration is then identified.

Calculate the number of data points in this cluster.

(A) 0
(B) 1
(C) 2
(D) 3
(E) 4
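
An answer to this exercise can be checked numerically. The following is a minimal sketch, assuming that "a single iteration" means computing the centroid of each initial cluster and then reassigning every point to its nearest centroid. The names `centroid`, `kmeans_step`, and `sizes` are illustrative and not part of the exercise, and ties are assumed not to occur (if one did, `min` would pick the first cluster in dictionary order).

```python
from math import dist

# Initial (random) assignments from the exercise table.
clusters = {
    "A": [(2, -1), (-1, 2), (-2, 1), (1, 2)],
    "B": [(4, 0), (4, -1), (0, -2), (0, -5)],
    "C": [(-1, 0), (3, 8), (-2, 0), (0, 0)],
}

def centroid(pts):
    """Coordinate-wise mean of a list of 2-D points."""
    n = len(pts)
    return (sum(p[0] for p in pts) / n, sum(p[1] for p in pts) / n)

def kmeans_step(clusters):
    """One iteration: compute centroids, then move each point to its nearest centroid."""
    centroids = {k: centroid(pts) for k, pts in clusters.items()}
    updated = {k: [] for k in clusters}
    for pts in clusters.values():
        for p in pts:
            # Euclidean distance from p to each centroid; keep the closest cluster.
            nearest = min(centroids, key=lambda k: dist(p, centroids[k]))
            updated[nearest].append(p)
    return centroids, updated

centroids, updated = kmeans_step(clusters)
sizes = {k: len(pts) for k, pts in updated.items()}
print(centroids)  # centroid of each initial cluster
print(sizes)      # number of points per cluster after reassignment
```

The smallest value in `sizes` is the quantity the question asks for.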

Created by Admin, May 25'23

You apply 2-means clustering to a set of five observations with two features. You are given the following initial cluster assignments:

Observation	X₁	X₂	Initial cluster
1	1	3	1
2	0	4	1
3	6	2	1
4	5	2	2
5	1	6	2

Calculate the total within-cluster variation of the initial cluster assignments, based on the Euclidean distance measure.

(A) 32.0
(B) 70.3
(C) 77.3
(D) 118.3
(E) 141.0
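
The calculation can be sketched in code. This is a minimal sketch that assumes a common textbook definition, which the exercise itself does not state: the within-cluster variation of a cluster is the sum of squared Euclidean distances over all ordered pairs of its observations, divided by the cluster size. Under the alternative definition (sum of squared deviations from the centroid) the result would be half as large. The function names are illustrative.

```python
# Observations (X1, X2) and initial cluster labels from the exercise table.
observations = {1: (1, 3), 2: (0, 4), 3: (6, 2), 4: (5, 2), 5: (1, 6)}
initial_cluster = {1: 1, 2: 1, 3: 1, 4: 2, 5: 2}

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def within_cluster_variation(members):
    """Sum of squared distances over all ordered pairs, divided by cluster size."""
    return sum(sq_dist(p, q) for p in members for q in members) / len(members)

def total_within_cluster_variation(observations, assignment):
    """Group observations by cluster label and add up the per-cluster variation."""
    groups = {}
    for obs_id, k in assignment.items():
        groups.setdefault(k, []).append(observations[obs_id])
    return sum(within_cluster_variation(m) for m in groups.values())

total = total_within_cluster_variation(observations, initial_cluster)
print(round(total, 1))
```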

Determine which of the following statements about selecting the optimal number of clusters in K-means clustering is/are true.

I. K should be set equal to n, the number of observations.
II. Choose K such that the total within-cluster variation is minimized.
III. The determination of K is subjective, and no single method exists for determining the optimal number of clusters.

(A) I only
(B) II only
(C) III only
(D) I, II and III
(E) The correct answer is not given by (A), (B), (C), or (D).
