Unsupervised Learning: MCQs Practice Set

Q.1 Which of the following is an example of unsupervised learning?

Spam email detection
Customer segmentation
House price prediction
Stock price forecasting
Explanation - Unsupervised learning is used when labeled data is not available. Customer segmentation groups customers based on patterns without predefined labels.
Correct answer is: Customer segmentation

Q.2 Which algorithm is commonly used for clustering in unsupervised learning?

Linear Regression
Decision Tree
K-Means
Naïve Bayes
Explanation - K-Means is a popular clustering algorithm in unsupervised learning that groups data into k clusters based on similarity.
Correct answer is: K-Means
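
Example - A minimal sketch of the K-Means loop (assign each point to its nearest centroid, then move each centroid to the mean of its members). The function name `kmeans`, the sample points, and the choice of the first k points as starting centroids are assumptions made for illustration; practical implementations usually pick starting centroids more carefully.

```python
import math


def kmeans(points, k, iterations=10):
    """Cluster 2-D points into k groups with the basic assign/update loop."""
    # Assumption: the first k points act as initial centroids (deterministic demo).
    centroids = [points[i] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centroids


# Made-up points forming two visibly separate groups.
points = [(1.0, 1.0), (8.0, 8.0), (1.5, 2.0), (9.0, 8.5), (1.0, 0.5), (8.5, 9.0)]
labels, centroids = kmeans(points, k=2)
print(labels)
print(centroids)
```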

Q.3 Principal Component Analysis (PCA) is primarily used for:

Classification
Regression
Dimensionality reduction
Clustering
Explanation - PCA reduces the number of dimensions while retaining most of the variance, making it a dimensionality reduction technique in unsupervised learning.
Correct answer is: Dimensionality reduction
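
Example - A minimal sketch of PCA on 2-D data, reducing it to one dimension. It assumes only two features so that the leading eigenvector of the covariance matrix can be found with a closed-form angle; the function name `first_principal_component` and the sample points are hypothetical. Real datasets with more features would need a general eigen-decomposition.

```python
import math


def first_principal_component(points):
    """Return the unit direction of maximum variance and the mean of 2-D points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Entries of the 2x2 sample covariance matrix.
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - mean_y) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / (n - 1)
    # Closed-form angle of the leading eigenvector of a symmetric 2x2 matrix.
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    return (math.cos(theta), math.sin(theta)), (mean_x, mean_y)


# Made-up points lying exactly on the line y = 2x.
points = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
direction, mean = first_principal_component(points)
# Project each centered point onto the component: 2-D data becomes 1-D scores.
scores = [
    (x - mean[0]) * direction[0] + (y - mean[1]) * direction[1]
    for x, y in points
]
print(direction)
print(scores)
```

Because the sample points are collinear, the single component keeps all of the variance; with noisy data some variance would be lost in the projection.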

Q.4 In K-Means clustering, 'K' stands for:

Number of features
Number of clusters
Number of iterations
Number of data points
Explanation - In K-Means, 'K' specifies the number of clusters into which the data will be divided.
Correct answer is: Number of clusters

Q.5 Which of the following is NOT an unsupervised learning task?

Clustering
Dimensionality reduction
Classification
Association rule mining
Explanation - Classification requires labeled data and hence belongs to supervised learning, not unsupervised learning.
Correct answer is: Classification

Q.6 Hierarchical clustering produces:

Decision boundaries
A regression model
A dendrogram
Probability distributions
Explanation - Hierarchical clustering creates a tree-like diagram called a dendrogram to represent nested clusters.
Correct answer is: A dendrogram
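
Example - A minimal sketch of agglomerative (bottom-up) hierarchical clustering on 1-D values. The merge history it returns is the information a dendrogram draws: which clusters were joined and at what height. Single linkage (smallest gap between any two members) is an assumed choice here, and the function name `single_linkage` and the sample values are hypothetical.

```python
def single_linkage(values):
    """Agglomerative clustering on 1-D values; returns the merge history."""
    clusters = [[v] for v in values]
    merges = []
    while len(clusters) > 1:
        best = None
        # Find the pair of clusters with the smallest gap between any two members.
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                gap = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or gap < best[0]:
                    best = (gap, i, j)
        gap, i, j = best
        # Record the merge: the two clusters joined and the height of the join.
        merges.append((sorted(clusters[i]), sorted(clusters[j]), gap))
        merged = clusters[i] + clusters[j]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)] + [merged]
    return merges


merges = single_linkage([1.0, 2.0, 6.0, 7.5, 20.0])
for left, right, height in merges:
    print(left, "+", right, "at height", height)
```

Cutting the merge history at a chosen height gives a flat clustering: every merge above the cut is undone.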

Q.7 Association rule mining is often applied in:

Market basket analysis
Regression tasks
Speech recognition
Spam filtering
Explanation - Association rule mining identifies patterns such as frequently purchased item sets, commonly used in market basket analysis.
Correct answer is: Market basket analysis
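
Example - A minimal sketch of the two basic measures behind association rules: support (how often an item set appears) and confidence (how often the consequent appears when the antecedent does). The baskets and the function names are made up for illustration.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate of P(consequent | antecedent) for the rule antecedent -> consequent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Hypothetical shopping baskets.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "eggs"},
]
rule_support = support(baskets, {"bread", "butter"})
rule_confidence = confidence(baskets, {"bread"}, {"butter"})
print(rule_support, rule_confidence)
```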

Q.8 Which distance metric is commonly used in clustering?

Manhattan distance
Euclidean distance
Cosine similarity
All of the above
Explanation - Clustering can use various distance or similarity measures, including Euclidean distance, Manhattan distance, and cosine similarity, depending on the data type.
Correct answer is: All of the above
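
Example - A minimal sketch of the three measures, written as plain functions over equal-length numeric sequences. The function names and sample vectors are illustrative only. Note that cosine similarity grows as vectors become more alike, whereas the two distances shrink.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


a, b = (1.0, 2.0, 3.0), (4.0, 6.0, 3.0)
print(euclidean(a, b), manhattan(a, b), cosine_similarity(a, b))
```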

Q.9 Unsupervised learning deals with:

Labeled data
Unlabeled data
Partially labeled data
Reinforcement signals
Explanation - Unsupervised learning identifies hidden patterns in data without using labels.
Correct answer is: Unlabeled data

Q.10 Which of these is a density-based clustering algorithm?

DBSCAN
K-Means
Linear Regression
Random Forest
Explanation - DBSCAN groups data points that lie in dense regions and marks isolated points as noise, unlike K-Means, which assigns every point to its nearest centroid.
Correct answer is: DBSCAN
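
Example - A minimal, unoptimized sketch of the DBSCAN idea: a point with at least `min_pts` neighbours within radius `eps` (counting itself) is a core point, clusters grow outward from core points, and points reachable from no core point are labelled noise (-1). The function name `dbscan`, the parameter values, and the sample points are assumptions for illustration.

```python
import math


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    labels = [None] * len(points)

    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            more = neighbors(j)
            if len(more) >= min_pts:  # j is a core point, so keep expanding
                queue.extend(more)
    return labels


# Two dense groups and one isolated point (made-up data).
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)
```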

Q.11 Dimensionality reduction helps mainly in:

Reducing computational cost
Improving data visualization
Eliminating redundancy
All of the above
Explanation - Dimensionality reduction improves efficiency, visualization, and reduces redundancy, making analysis easier.
Correct answer is: All of the above

Q.12 Which technique can be used for anomaly detection in unsupervised learning?

K-Means clustering
Support Vector Machines
Naïve Bayes
Logistic Regression
Explanation - K-Means can be used to identify data points far from cluster centroids, helping in anomaly detection.
Correct answer is: K-Means clustering
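
Example - A minimal sketch of centroid-distance anomaly detection. It assumes the centroids were already produced by an earlier K-Means fit and that a fixed distance threshold is acceptable; the function name `flag_anomalies`, the centroids, the points, and the threshold are all hypothetical.

```python
import math


def flag_anomalies(points, centroids, threshold):
    """Return the points whose distance to the nearest centroid exceeds threshold."""
    flagged = []
    for p in points:
        nearest = min(math.dist(p, c) for c in centroids)
        if nearest > threshold:
            flagged.append(p)
    return flagged


# Assumption: these centroids came from a previous K-Means run.
centroids = [(1.0, 1.0), (8.0, 8.0)]
points = [(1.2, 0.9), (7.5, 8.3), (4.5, 4.5), (0.8, 1.4), (15.0, 1.0)]
anomalies = flag_anomalies(points, centroids, threshold=2.0)
print(anomalies)
```

In practice the threshold is often derived from the data, for example from a high percentile of the observed distances, rather than fixed by hand.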

Q.13 Self-organizing maps (SOMs) are mainly used for:

Classification
Clustering and visualization
Regression
Time series forecasting
Explanation - SOMs are neural networks that reduce dimensions and help in clustering and visualizing high-dimensional data.
Correct answer is: Clustering and visualization

Q.14 In PCA, the principal components are chosen to maximize:

Mean
Variance
Error
Distance
Explanation - Principal components maximize variance to capture the most information in fewer dimensions.
Correct answer is: Variance
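
Worked equation - The variance criterion can be written as an optimization problem. The symbols below are notation introduced here for illustration: $\Sigma$ is the covariance matrix of the centered data, $w$ is a unit-length direction, and $\lambda_1$ is the largest eigenvalue of $\Sigma$.

```latex
\operatorname{Var}(w^\top x) = w^\top \Sigma\, w,
\qquad
w_1 = \arg\max_{\|w\| = 1} \; w^\top \Sigma\, w,
\qquad
w_1^\top \Sigma\, w_1 = \lambda_1 .
```

The maximizer $w_1$ is the eigenvector of $\Sigma$ with the largest eigenvalue, and the variance captured along it equals that eigenvalue. Each later component solves the same problem with the added constraint of being orthogonal to the earlier ones.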

Q.15 Which of these is an application of clustering?

Customer segmentation
Image compression
Document classification
All of the above
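Explanation - Customer segmentation is the most direct application of clustering. Document classification relies on labeled data and is a supervised task, so "All of the above" does not hold. Image compression can make use of clustering (for example, color quantization with K-Means), but it is not primarily a clustering problem.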
Correct answer is: Customer segmentation

Q.16 Which method reduces noise in data by grouping similar points?

Regression
Clustering
Classification
Reinforcement learning
Explanation - Clustering groups similar points together, which helps reduce noise and highlight patterns.
Correct answer is: Clustering

Q.17 Gaussian Mixture Models (GMMs) are based on:

Probability distributions
Decision trees
Neural networks
Support vectors
Explanation - GMM assumes data is generated from a mixture of Gaussian probability distributions.
Correct answer is: Probability distributions
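
Example - A minimal sketch of how a fitted GMM assigns a point softly to its components. It assumes a 1-D mixture whose weights, means, and standard deviations are already known (the values below are made up) and computes the posterior probability of each component for a given x, which is the quantity the E-step of the EM algorithm evaluates.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a 1-D normal distribution at x."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))


def responsibilities(x, components):
    """Posterior probability of each (weight, mean, std) component given x."""
    weighted = [w * gaussian_pdf(x, m, s) for w, m, s in components]
    total = sum(weighted)  # value of the mixture density at x
    return [v / total for v in weighted]


# Hypothetical two-component mixture: equal weights, means 0 and 5, unit variance.
components = [(0.5, 0.0, 1.0), (0.5, 5.0, 1.0)]
near_first = responsibilities(0.2, components)
midpoint = responsibilities(2.5, components)
print(near_first)
print(midpoint)
```

Unlike K-Means, which gives each point exactly one label, these probabilities express uncertainty for points that lie between components.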

Q.18 Which evaluation metric is commonly used for clustering?

Accuracy
Silhouette score
Precision
Recall
Explanation - Silhouette score evaluates how similar a point is to its cluster compared to other clusters, useful for clustering validation.
Correct answer is: Silhouette score
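
Example - A minimal sketch of the silhouette score. For each point, a is the mean distance to the other members of its own cluster and b is the lowest mean distance to the members of any other cluster; the point's score is (b - a) / max(a, b), and the overall score is the mean over all points. The function name, the sample points, and the convention of scoring singleton clusters as 0 are assumptions of this sketch.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (needs at least 2 clusters)."""
    scores = []
    for i, p in enumerate(points):
        # Mean distance from p to the members of each cluster, excluding p itself.
        mean_dist = {}
        for lab in set(labels):
            others = [q for j, q in enumerate(points) if labels[j] == lab and j != i]
            if others:
                mean_dist[lab] = sum(math.dist(p, q) for q in others) / len(others)
        if labels[i] not in mean_dist:
            scores.append(0.0)  # assumed convention: singleton clusters score 0
            continue
        a = mean_dist[labels[i]]
        b = min(d for lab, d in mean_dist.items() if lab != labels[i])
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = silhouette_score(points, [0, 0, 1, 1])  # groups match the geometry
bad = silhouette_score(points, [0, 1, 0, 1])   # groups cut across the geometry
print(good, bad)
```

Scores near 1 indicate compact, well-separated clusters; scores below 0 suggest that points sit closer to another cluster than to their own.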

Q.19 Which unsupervised method is often used in recommendation systems?

Collaborative filtering
Logistic regression
Naïve Bayes
Linear regression
Explanation - Collaborative filtering finds patterns in user-item interactions without explicit labels and is widely used in recommendation systems, where similarities among users or items drive the recommendations.
Correct answer is: Collaborative filtering
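
Example - A minimal sketch of user-based collaborative filtering: the predicted rating is a similarity-weighted average of the ratings other users gave the item. Cosine similarity over co-rated items is one assumed choice among several, and the user names, item names, and ratings are made up.

```python
import math


def cosine(u, v):
    """Cosine similarity computed over the items both users rated."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[i] * v[i] for i in shared)
    norm_u = math.sqrt(sum(u[i] ** 2 for i in shared))
    norm_v = math.sqrt(sum(v[i] ** 2 for i in shared))
    return dot / (norm_u * norm_v)


def predict(ratings, user, item):
    """Similarity-weighted average of other users' ratings for `item`."""
    num = den = 0.0
    for other, theirs in ratings.items():
        if other == user or item not in theirs:
            continue
        sim = cosine(ratings[user], theirs)
        num += sim * theirs[item]
        den += sim
    return num / den if den else None  # None when no similar user rated the item


# Hypothetical ratings on a 1-5 scale.
ratings = {
    "ana": {"A": 5, "B": 4},
    "ben": {"A": 5, "B": 4, "C": 5},
    "cat": {"A": 1, "B": 5, "C": 1},
}
prediction = predict(ratings, "ana", "C")
print(prediction)
```

Since ana's ratings match ben's more closely than cat's, the prediction leans toward ben's rating of item C.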

Q.20 Which algorithm can handle non-linearly separable clusters?

K-Means
DBSCAN
Linear regression
Logistic regression
Explanation - DBSCAN can detect arbitrarily shaped clusters, unlike K-Means, which works best when clusters are roughly spherical.
Correct answer is: DBSCAN

Q.21 Clustering is an example of:

Supervised learning
Unsupervised learning
Reinforcement learning
Semi-supervised learning
Explanation - Clustering does not require labeled data and is a key unsupervised learning task.
Correct answer is: Unsupervised learning

Q.22 The 'elbow method' is used to determine:

Number of iterations
Number of clusters
Number of features
Number of data points
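Explanation - The elbow method helps choose the number of clusters by plotting the within-cluster sum of squares against the cluster count and picking the point (the "elbow") after which adding clusters yields only small improvements.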
Correct answer is: Number of clusters
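
Example - A minimal sketch of the quantity an elbow plot shows: the within-cluster sum of squares (WCSS) for each candidate number of clusters. To stay short and deterministic it uses 1-D data and, instead of running K-Means, finds the best possible WCSS by brute force (this substitution is an assumption of the sketch, valid because optimal 1-D clusters are contiguous runs of the sorted values). The function names and sample values are hypothetical.

```python
from itertools import combinations


def sse(group):
    """Sum of squared distances from each value to the group mean."""
    mean = sum(group) / len(group)
    return sum((v - mean) ** 2 for v in group)


def best_wcss(values, k):
    """Lowest within-cluster sum of squares for k clusters of 1-D data."""
    data = sorted(values)
    best = float("inf")
    # Try every way of cutting the sorted values into k contiguous groups.
    for cuts in combinations(range(1, len(data)), k - 1):
        bounds = (0, *cuts, len(data))
        total = sum(sse(data[lo:hi]) for lo, hi in zip(bounds, bounds[1:]))
        best = min(best, total)
    return best


# Made-up values forming three tight groups around 1, 5, and 9.
values = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8, 9.0, 9.2, 8.8]
curve = {k: best_wcss(values, k) for k in range(1, 6)}
for k, wcss in curve.items():
    print(k, round(wcss, 3))
```

For this data the WCSS falls sharply up to three clusters and barely changes afterwards, so the elbow sits at three.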

Q.23 Anomaly detection in network security often uses:

Classification models
Clustering algorithms
Linear regression
Decision trees
Explanation - Clustering is used to detect unusual patterns in network traffic that may indicate anomalies.
Correct answer is: Clustering algorithms

Q.24 Which unsupervised learning technique reduces dimensionality by learning latent features?

Autoencoders
Decision trees
Logistic regression
Naïve Bayes
Explanation - Autoencoders are neural networks that learn compressed representations of data in an unsupervised way.
Correct answer is: Autoencoders

Q.25 Cluster analysis can be applied in:

Gene expression data analysis
Weather forecasting
Stock price prediction
House price regression
Explanation - Clustering is widely used in bioinformatics to group genes with similar expression patterns.
Correct answer is: Gene expression data analysis