# Clustering Methods: MCQ Practice Set

Q.1 Which of the following is a partitioning clustering method?

K-Means
DBSCAN
Agglomerative Hierarchical
OPTICS
Explanation - K-Means is a partitioning clustering method that divides data into k non-overlapping subsets or clusters.
Correct answer is: K-Means
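
A minimal sketch of partitioning with K-Means, assuming scikit-learn is available; the dataset and k=3 are illustrative choices, not part of the question.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=3, random_state=42)

# Partition the data into k=3 non-overlapping clusters.
km = KMeans(n_clusters=3, n_init=10, random_state=42)
labels = km.fit_predict(X)
print(labels[:10], km.cluster_centers_.shape)  # each point gets exactly one cluster label
```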

Q.2 In hierarchical clustering, which method starts with each object as a separate cluster?

Agglomerative
Divisive
K-Means
DBSCAN
Explanation - Agglomerative hierarchical clustering is a bottom-up approach: each object starts as its own cluster, and the closest pairs of clusters are merged iteratively.
Correct answer is: Agglomerative
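
A minimal sketch of bottom-up (agglomerative) clustering, assuming scikit-learn is available; merging continues until the requested number of clusters remains.

```python
from sklearn.cluster import AgglomerativeClustering
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=50, centers=3, random_state=0)

# Each point starts as its own cluster; closest pairs are merged until 3 clusters remain.
agg = AgglomerativeClustering(n_clusters=3)
labels = agg.fit_predict(X)
print(labels)
```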

Q.3 Which clustering method is density-based?

DBSCAN
K-Means
Hierarchical
Mean Shift
Explanation - DBSCAN groups points closely packed together based on density, and can find arbitrarily shaped clusters.
Correct answer is: DBSCAN
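
A minimal sketch of density-based clustering with DBSCAN, assuming scikit-learn is available; the two-moons data and the eps/min_samples values are illustrative.

```python
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_moons

# Two interleaving half-moons: non-spherical clusters that K-Means handles poorly.
X, _ = make_moons(n_samples=300, noise=0.05, random_state=0)

db = DBSCAN(eps=0.2, min_samples=5)
labels = db.fit_predict(X)
print(set(labels))  # cluster ids; -1 (if present) marks noise points
```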

Q.4 What is the main objective of clustering?

To classify data into predefined classes
To group similar objects together
To reduce data dimensions
To generate predictive models
Explanation - Clustering aims to group similar data points into clusters without prior labels, identifying inherent patterns in the data.
Correct answer is: To group similar objects together

Q.5 Which of the following is a measure of cluster quality?

Silhouette coefficient
Mean squared error
Chi-square test
Correlation coefficient
Explanation - The silhouette coefficient measures how similar an object is to its own cluster compared to other clusters.
Correct answer is: Silhouette coefficient
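
A minimal sketch of scoring cluster quality with the silhouette coefficient, assuming scikit-learn is available; the blob data and k=4 are illustrative.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

X, _ = make_blobs(n_samples=300, centers=4, random_state=1)
labels = KMeans(n_clusters=4, n_init=10, random_state=1).fit_predict(X)

# Ranges from -1 (poorly matched) to +1 (well matched to its own cluster).
print(silhouette_score(X, labels))
```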

Q.6 Which method of hierarchical clustering merges clusters based on minimum distance between points?

Single linkage
Complete linkage
Average linkage
Ward's method
Explanation - Single linkage considers the minimum distance between points in two clusters when merging, which can lead to chaining effects.
Correct answer is: Single linkage
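
A minimal sketch of single-linkage merging with SciPy, assuming SciPy is available; the two synthetic groups are illustrative.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, (20, 2)), rng.normal(3, 0.3, (20, 2))])

# 'single' merges the two clusters whose closest pair of points is nearest.
Z = linkage(X, method='single')
labels = fcluster(Z, t=2, criterion='maxclust')
print(labels)
```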

Q.7 What is the main disadvantage of K-Means clustering?

It requires the number of clusters to be specified
It can handle non-spherical clusters
It works with categorical data easily
It does not require distance metrics
Explanation - K-Means needs the value of k (number of clusters) in advance, which is not always known.
Correct answer is: It requires the number of clusters to be specified

Q.8 Which clustering algorithm is suitable for discovering clusters of arbitrary shape?

DBSCAN
K-Means
K-Medoids
Hierarchical Complete Linkage
Explanation - DBSCAN can discover clusters of arbitrary shapes because it relies on density connectivity instead of spherical assumptions.
Correct answer is: DBSCAN

Q.9 In K-Medoids clustering, what is used as the center of a cluster?

The mean of points
The median point (medoid)
Random point
Maximum distance point
Explanation - K-Medoids uses actual data points (medoids), the most centrally located point of each cluster, as cluster centers, making it more robust to outliers than K-Means.
Correct answer is: The median point (medoid)
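
A minimal NumPy-only sketch of what a medoid is: the actual data point with the smallest total distance to the other points in its cluster. The toy cluster and the injected outlier are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
cluster = rng.normal(0, 1, (10, 2))
cluster[0] = [8.0, 8.0]  # an outlier pulls the mean strongly but barely moves the medoid

# Pairwise Euclidean distances; the medoid minimises the row sum.
dists = np.linalg.norm(cluster[:, None, :] - cluster[None, :, :], axis=-1)
medoid = cluster[dists.sum(axis=1).argmin()]
print("mean:  ", cluster.mean(axis=0))
print("medoid:", medoid)
```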

Q.10 Which of the following is a major step in density-based clustering?

Identifying core points and reachable points
Randomly assigning cluster centers
Computing the mean of each cluster
Performing PCA
Explanation - Density-based clustering identifies core points and builds clusters by connecting density-reachable points.
Correct answer is: Identifying core points and reachable points

Q.11 What type of clustering does the Mean Shift algorithm perform?

Centroid-based
Density-based
Hierarchical
Graph-based
Explanation - Mean Shift is a density-based clustering method that iteratively shifts points towards the mode of the density in the feature space.
Correct answer is: Density-based
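
A minimal sketch of Mean Shift, assuming scikit-learn is available; the blob data and the bandwidth quantile are illustrative choices.

```python
from sklearn.cluster import MeanShift, estimate_bandwidth
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.6, random_state=0)

# The bandwidth sets the kernel width used to estimate the local density.
bw = estimate_bandwidth(X, quantile=0.2)
ms = MeanShift(bandwidth=bw)
labels = ms.fit_predict(X)
print(ms.cluster_centers_)  # the density modes the points were shifted towards
```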

Q.12 Which hierarchical clustering method tends to produce compact clusters?

Ward's method
Single linkage
Complete linkage
Average linkage
Explanation - Ward's method minimizes the total within-cluster variance, leading to more compact and spherical clusters.
Correct answer is: Ward's method
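
One way to see Ward's tendency towards compact clusters is to compare the within-cluster sum of squares it yields against single linkage. A minimal sketch assuming scikit-learn; the data are synthetic and the exact numbers will vary, though single linkage often produces far less compact partitions when clusters touch.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=200, centers=3, cluster_std=1.5, random_state=2)

def within_cluster_sse(X, labels):
    # Total squared distance of points to their own cluster mean (smaller = more compact).
    return sum(((X[labels == c] - X[labels == c].mean(axis=0)) ** 2).sum()
               for c in np.unique(labels))

for link in ("ward", "single"):
    labels = AgglomerativeClustering(n_clusters=3, linkage=link).fit_predict(X)
    print(link, within_cluster_sse(X, labels))
```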

Q.13 Which factor does not affect K-Means clustering results?

Initial cluster centroids
Distance metric used
Outliers in data
Number of features not used in clustering
Explanation - Features excluded from the clustering have no influence on the result; the initial centroids, the distance metric, and outliers in the data all affect the clusters K-Means produces.
Correct answer is: Number of features not used in clustering

Q.14 Which clustering technique is most suitable for very large datasets?

K-Means
Agglomerative Hierarchical
DBSCAN
Mean Shift
Explanation - K-Means is computationally efficient and scales roughly linearly with the number of points per iteration, making it more practical for very large datasets than agglomerative hierarchical clustering or Mean Shift.
Correct answer is: K-Means

Q.15 What is a dendrogram?

A tree diagram showing hierarchical clustering
A measure of cluster compactness
A type of density plot
A centroid in K-Means
Explanation - A dendrogram visually represents the process of hierarchical clustering, showing how clusters are merged or split.
Correct answer is: A tree diagram showing hierarchical clustering
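
A minimal sketch of drawing a dendrogram, assuming SciPy and matplotlib are available; the random data and Ward linkage are illustrative.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import linkage, dendrogram

rng = np.random.default_rng(0)
X = rng.normal(size=(12, 2))

Z = linkage(X, method='ward')   # the merge history of hierarchical clustering
dendrogram(Z)                   # tree diagram: leaves are points, joins are merges
plt.show()
```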

Q.16 Which distance metric is commonly used in K-Means clustering?

Euclidean distance
Manhattan distance
Cosine similarity
Jaccard coefficient
Explanation - Euclidean distance is the standard metric in K-Means; the algorithm assigns each point to its nearest centroid and minimizes the sum of squared Euclidean distances within clusters.
Correct answer is: Euclidean distance
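
A minimal NumPy-only sketch of the K-Means assignment step: each point goes to the centroid with the smallest Euclidean distance. The points and centroids are illustrative.

```python
import numpy as np

X = np.array([[0.0, 0.0], [0.2, 0.1], [5.0, 5.0], [5.1, 4.9]])
centroids = np.array([[0.0, 0.0], [5.0, 5.0]])

# Pairwise Euclidean distances, shape (n_points, n_centroids).
d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=-1)
print(d.argmin(axis=1))  # -> [0 0 1 1]
```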

Q.17 Which is true about DBSCAN regarding noise points?

They are assigned to the nearest cluster
They are ignored as outliers
They form separate clusters
They are used to calculate centroids
Explanation - DBSCAN labels points that do not belong to any dense region as noise or outliers.
Correct answer is: They are ignored as outliers
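
A minimal sketch showing DBSCAN's noise labels, assuming scikit-learn is available; the tiny hand-made dataset and parameters are illustrative.

```python
import numpy as np
from sklearn.cluster import DBSCAN

X = np.array([[0, 0], [0.1, 0], [0, 0.1],
              [5, 5], [5.1, 5], [5, 5.1],
              [10, 10]])
labels = DBSCAN(eps=0.5, min_samples=3).fit_predict(X)
print(labels)  # the isolated point [10, 10] is labelled -1, i.e. noise/outlier
```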

Q.18 What is the main advantage of K-Medoids over K-Means?

Handles categorical data and outliers better
Faster computation
No need to specify number of clusters
Automatically finds density-based clusters
Explanation - K-Medoids uses actual data points as centers, which makes it robust to outliers and suitable for categorical data.
Correct answer is: Handles categorical data and outliers better

Q.19 Which clustering method can produce a hierarchy of clusters without specifying the number of clusters in advance?

Hierarchical clustering
K-Means
K-Medoids
DBSCAN
Explanation - Hierarchical clustering builds a tree-like structure of clusters and does not require specifying k in advance.
Correct answer is: Hierarchical clustering
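
A minimal sketch of hierarchical clustering without fixing k, assuming scikit-learn is available: the full merge tree is built and then cut at a distance threshold. The threshold value here is arbitrary and purely illustrative.

```python
from sklearn.cluster import AgglomerativeClustering
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=100, centers=3, random_state=3)

# n_clusters=None: the number of clusters follows from where the tree is cut.
agg = AgglomerativeClustering(n_clusters=None, distance_threshold=10.0)
labels = agg.fit_predict(X)
print(agg.n_clusters_)  # number of clusters implied by the chosen threshold
```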

Q.20 In density-based clustering, what is a core point?

A point with enough neighboring points within a radius
A point in the center of a cluster
A point with minimum distance to centroid
A point with no neighbors
Explanation - A core point has a minimum number of neighbors within a specified distance, which forms the basis of density-based clusters.
Correct answer is: A point with enough neighboring points within a radius
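
A minimal sketch of identifying core points with DBSCAN, assuming scikit-learn is available: a point is a core point if at least min_samples points (itself included) lie within radius eps. The data and parameters are illustrative.

```python
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=200, centers=2, cluster_std=0.5, random_state=4)

db = DBSCAN(eps=0.5, min_samples=5).fit(X)
print(len(db.core_sample_indices_), "core points out of", len(X))
```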

Q.21 Which of the following is a limitation of hierarchical clustering?

High computational cost for large datasets
Cannot produce a dendrogram
Requires specifying number of clusters
Does not use distance metrics
Explanation - Agglomerative hierarchical clustering requires computing and updating pairwise distances between all points, giving at least quadratic time and memory cost, which makes it expensive for large datasets.
Correct answer is: High computational cost for large datasets

Q.22 Which algorithm is better suited for clustering spatial data with noise?

DBSCAN
K-Means
Hierarchical Agglomerative
K-Medoids
Explanation - DBSCAN can identify dense regions and separate noise, making it suitable for spatial data clustering.
Correct answer is: DBSCAN

Q.23 What is the main purpose of the elbow method in K-Means clustering?

To determine optimal number of clusters
To initialize centroids
To calculate cluster density
To remove outliers
Explanation - The elbow method helps identify the number of clusters beyond which adding more clusters does not significantly reduce the within-cluster sum of squared errors.
Correct answer is: To determine optimal number of clusters
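
A minimal sketch of the elbow method, assuming scikit-learn and matplotlib are available: plot the within-cluster sum of squares (inertia) against k and look for the bend. The blob data and the range of k are illustrative.

```python
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=4, random_state=5)

ks = range(1, 9)
inertias = [KMeans(n_clusters=k, n_init=10, random_state=5).fit(X).inertia_ for k in ks]

plt.plot(ks, inertias, marker='o')
plt.xlabel('k')
plt.ylabel('within-cluster SSE (inertia)')
plt.show()  # the "elbow" suggests a good k; here it should appear near k=4
```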

Q.24 Which clustering method is less sensitive to outliers?

K-Medoids
K-Means
Hierarchical single linkage
Mean Shift
Explanation - K-Medoids chooses actual points as centers, making it less sensitive to outliers compared to K-Means.
Correct answer is: K-Medoids