Q.1 Which of the following is an example of unsupervised learning?
Spam email detection
Customer segmentation
House price prediction
Stock price forecasting
Explanation - Unsupervised learning is used when labeled data is not available. Customer segmentation groups customers based on patterns without predefined labels.
Correct answer is: Customer segmentation
Q.2 Which algorithm is commonly used for clustering in unsupervised learning?
Linear Regression
Decision Tree
K-Means
Naïve Bayes
Explanation - K-Means is a popular clustering algorithm in unsupervised learning that groups data into k clusters based on similarity.
Correct answer is: K-Means
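The assign-then-update loop behind K-Means can be sketched in a few lines. This is a minimal illustration rather than a production implementation: the function name `kmeans`, the sample points, and the choice to seed the centroids with the first k points are all assumptions made for the example.

```python
import math

def kmeans(points, k, iterations=10):
    """Plain Lloyd-style K-Means; the first k points seed the centroids (illustrative choice)."""
    centroids = [points[i] for i in range(k)]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        for i, members in enumerate(clusters):
            if members:
                centroids[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, clusters

# Hypothetical sample data: two visibly separate groups.
points = [(1.0, 1.0), (8.0, 8.0), (1.5, 2.0), (9.0, 8.5), (1.0, 1.5), (8.5, 9.0)]
centroids, clusters = kmeans(points, k=2)
print(centroids)
print(clusters)
```

Real libraries usually add smarter seeding and a convergence check; the fixed iteration count here only keeps the sketch short.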
Q.3 Principal Component Analysis (PCA) is primarily used for:
Classification
Regression
Dimensionality reduction
Clustering
Explanation - PCA reduces the number of dimensions while retaining most of the variance, making it a dimensionality reduction technique in unsupervised learning.
Correct answer is: Dimensionality reduction
Q.4 In K-Means clustering, 'K' stands for:
Number of features
Number of clusters
Number of iterations
Number of data points
Explanation - In K-Means, 'K' specifies the number of clusters into which the data will be divided.
Correct answer is: Number of clusters
Q.5 Which of the following is NOT an unsupervised learning task?
Clustering
Dimensionality reduction
Classification
Association rule mining
Explanation - Classification requires labeled data and hence belongs to supervised learning, not unsupervised learning.
Correct answer is: Classification
Q.6 Hierarchical clustering produces:
Decision boundaries
A regression model
A dendrogram
Probability distributions
Explanation - Hierarchical clustering creates a tree-like diagram called a dendrogram to represent nested clusters.
Correct answer is: A dendrogram
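A dendrogram is essentially a record of which clusters merged and at what distance. The sketch below builds that merge history for a few one-dimensional values using single linkage (closest pair of members). The function name `single_linkage` and the sample values are assumptions for illustration; other linkage rules would give different merge heights.

```python
def single_linkage(values):
    """Agglomerative clustering on 1-D values; returns the merge history."""
    clusters = [[v] for v in values]
    merges = []
    while len(clusters) > 1:
        # Find the pair of clusters whose closest members are nearest to each other.
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return merges

# Hypothetical sample values.
merges = single_linkage([1.0, 2.0, 6.0, 7.0, 15.0])
for left, right, height in merges:
    print(left, "+", right, "at distance", height)
```

Each printed line corresponds to one junction in the dendrogram, with the distance giving the height of that junction.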
Q.7 Association rule mining is often applied in:
Market basket analysis
Regression tasks
Speech recognition
Spam filtering
Explanation - Association rule mining identifies patterns such as frequently purchased item sets, commonly used in market basket analysis.
Correct answer is: Market basket analysis
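Two quantities commonly used to score an association rule are support (how often an item set appears) and confidence (how often the rule holds when its left-hand side appears). The sketch below computes both for a tiny basket data set; the transactions and the helper names `support` and `confidence` are made up for illustration.

```python
# Hypothetical shopping baskets.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support(itemset):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent):
    """Estimated P(consequent | antecedent) from the transactions."""
    return support(antecedent | consequent) / support(antecedent)

print(support({"bread", "milk"}))       # 0.5
print(confidence({"bread"}, {"milk"}))  # about 0.667
```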
Q.8 Which distance metric is commonly used in clustering?
Manhattan distance
Euclidean distance
Cosine similarity
All of the above
Explanation - Clustering can use various distance or similarity measures, including Euclidean distance, Manhattan distance, and cosine similarity; the right choice depends on the data type.
Correct answer is: All of the above
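The three measures named in the explanation can be written directly from their definitions. The function names and the two sample vectors below are assumptions for illustration. Note that cosine similarity is a similarity, not a distance, so larger values mean the vectors point in more similar directions.

```python
import math

def euclidean(a, b):
    """Straight-line distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

a, b = (0.0, 3.0), (4.0, 0.0)
print(euclidean(a, b))          # 5.0
print(manhattan(a, b))          # 7.0
print(cosine_similarity(a, b))  # 0.0 (the vectors are perpendicular)
```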
Q.9 Unsupervised learning deals with:
Labeled data
Unlabeled data
Partially labeled data
Reinforcement signals
Explanation - Unsupervised learning identifies hidden patterns in data without using labels.
Correct answer is: Unlabeled data
Q.10 Which of these is a density-based clustering algorithm?
DBSCAN
K-Means
Linear Regression
Random Forest
Explanation - DBSCAN groups points that lie in dense regions and treats isolated points as noise, unlike K-Means, which assigns every point to its nearest centroid.
Correct answer is: DBSCAN
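To make the density idea concrete, here is a compact sketch of the DBSCAN procedure. It is a simplified illustration, not a reference implementation: the function name `dbscan`, the parameter names `eps` and `min_pts`, and the sample points are assumptions, and the neighbor search is a brute-force scan.

```python
import math

def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    labels = [None] * len(points)

    def neighbors(i):
        # All points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point is a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:  # j is a core point, so keep expanding
                queue.extend(nbrs)
    return labels

# Hypothetical data: two dense groups and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5), (10, 0)]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)
```

With these assumed settings the two dense groups get their own labels and the isolated point is marked as noise, with no cluster count supplied in advance.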
Q.11 Dimensionality reduction helps mainly in:
Reducing computational cost
Improving data visualization
Eliminating redundancy
All of the above
Explanation - Dimensionality reduction improves computational efficiency, aids visualization, and reduces redundancy, making analysis easier.
Correct answer is: All of the above
Q.12 Which technique can be used for anomaly detection in unsupervised learning?
K-Means clustering
Support Vector Machines
Naïve Bayes
Logistic Regression
Explanation - K-Means can be used to identify data points far from cluster centroids, helping in anomaly detection.
Correct answer is: K-Means clustering
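The distance-to-centroid idea can be sketched as follows. The centroids are assumed to come from an earlier K-Means fit, and the threshold of 2.0 is a hypothetical cutoff chosen for this toy data; in practice it would be tuned, for example from the distribution of distances.

```python
import math

centroids = [(1.0, 1.0), (8.0, 8.0)]  # assumed output of an earlier K-Means fit
points = [(1.2, 0.9), (7.8, 8.1), (4.5, 4.5), (0.8, 1.1)]
threshold = 2.0                       # hypothetical cutoff

def distance_to_nearest_centroid(p):
    return min(math.dist(p, c) for c in centroids)

# Flag points that are far from every centroid.
anomalies = [p for p in points if distance_to_nearest_centroid(p) > threshold]
print(anomalies)
```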
Q.13 Self-organizing maps (SOMs) are mainly used for:
Classification
Clustering and visualization
Regression
Time series forecasting
Explanation - SOMs are neural networks that reduce dimensions and help in clustering and visualizing high-dimensional data.
Correct answer is: Clustering and visualization
Q.14 In PCA, the principal components are chosen to maximize:
Mean
Variance
Error
Distance
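Explanation - Each principal component is the direction along which the projected data has the largest variance, subject to being orthogonal to the earlier components, so a few components capture most of the information.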
Correct answer is: Variance
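For two-dimensional data, the first principal component can be found in closed form from the covariance matrix, which makes the variance claim easy to check numerically. The sketch below uses made-up data and a hypothetical helper `projected_variance`; the closed-form angle applies only to the 2x2 case, and general PCA uses an eigendecomposition or SVD instead.

```python
import math

# Hypothetical data lying roughly along the line y = x.
data = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.8), (5.0, 5.0)]
n = len(data)
mean_x = sum(x for x, _ in data) / n
mean_y = sum(y for _, y in data) / n
centered = [(x - mean_x, y - mean_y) for x, y in data]

# Entries of the 2x2 sample covariance matrix.
cxx = sum(x * x for x, _ in centered) / (n - 1)
cyy = sum(y * y for _, y in centered) / (n - 1)
cxy = sum(x * y for x, y in centered) / (n - 1)

# Closed-form angle of the leading eigenvector of a symmetric 2x2 matrix.
theta = 0.5 * math.atan2(2 * cxy, cxx - cyy)
pc1 = (math.cos(theta), math.sin(theta))

def projected_variance(direction):
    """Sample variance of the centered data projected onto a unit direction."""
    proj = [x * direction[0] + y * direction[1] for x, y in centered]
    return sum(p * p for p in proj) / (n - 1)

print("variance along x axis:", projected_variance((1.0, 0.0)))
print("variance along y axis:", projected_variance((0.0, 1.0)))
print("variance along PC1:   ", projected_variance(pc1))
```

The variance along PC1 is at least as large as the variance along either original axis, which is what "maximizing variance" means here.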
Q.15 Which of these is an application of clustering?
Customer segmentation
Image compression
Document classification
All of the above
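Explanation - Customer segmentation is the most direct application of clustering. Document classification assigns documents to predefined labels and is therefore a supervised task, so "All of the above" does not hold. Note that K-Means is also used for image compression through color quantization, although that use is less central than segmentation.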
Correct answer is: Customer segmentation
Q.16 Which method reduces noise in data by grouping similar points?
Regression
Clustering
Classification
Reinforcement learning
Explanation - Clustering groups similar points together, which helps reduce noise and highlight patterns.
Correct answer is: Clustering
Q.17 Gaussian Mixture Models (GMMs) are based on:
Probability distributions
Decision trees
Neural networks
Support vectors
Explanation - A GMM assumes that the data are generated from a mixture of several Gaussian probability distributions, each with its own mean, spread, and mixing weight.
Correct answer is: Probability distributions
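The mixture idea can be illustrated with a one-dimensional example: the overall density is a weighted sum of Gaussian densities, and each point receives a soft "responsibility" from every component. The component parameters and function names below are hypothetical, and the sketch only evaluates a fixed mixture; fitting the parameters (typically with the EM algorithm) is not shown.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a 1-D normal distribution."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))

# Hypothetical mixture: (weight, mean, std) per component; weights sum to 1.
components = [(0.5, 0.0, 1.0), (0.5, 5.0, 1.0)]

def mixture_density(x):
    """Weighted sum of the component densities."""
    return sum(w * gaussian_pdf(x, m, s) for w, m, s in components)

def responsibilities(x):
    """Probability that each component generated x."""
    total = mixture_density(x)
    return [w * gaussian_pdf(x, m, s) / total for w, m, s in components]

print(responsibilities(0.0))  # almost entirely the first component
print(responsibilities(2.5))  # halfway between the means: an even split
```

Unlike K-Means, which gives each point a single hard label, these responsibilities express graded membership.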
Q.18 Which evaluation metric is commonly used for clustering?
Accuracy
Silhouette score
Precision
Recall
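Explanation - The silhouette score compares how close a point is to its own cluster with how close it is to the nearest other cluster. It ranges from -1 to 1, with higher values indicating better-separated clusters, and needs no ground-truth labels.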
Correct answer is: Silhouette score
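The silhouette computation can be sketched from its definition: for each point, a is the mean distance to the other members of its own cluster, b is the smallest mean distance to any other cluster, and the point's score is (b - a) / max(a, b). The function name and sample data are assumptions; the sketch assumes at least two clusters and no duplicate points.

```python
import math

def silhouette_score(points, labels):
    """Mean silhouette over all points (assumes at least two clusters)."""
    def mean_distance(i, cluster):
        dists = [math.dist(points[i], points[j])
                 for j in range(len(points)) if labels[j] == cluster and j != i]
        return sum(dists) / len(dists)

    scores = []
    for i in range(len(points)):
        if labels.count(labels[i]) == 1:
            scores.append(0.0)  # convention: a singleton cluster scores 0
            continue
        a = mean_distance(i, labels[i])
        b = min(mean_distance(i, c) for c in set(labels) if c != labels[i])
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Hypothetical data: two tight pairs far apart.
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = silhouette_score(points, [0, 0, 1, 1])  # labels match the geometry
bad = silhouette_score(points, [0, 1, 0, 1])   # labels cut across the pairs
print(good, bad)
```

The sensible labeling scores close to 1, while the labeling that splits each pair scores below 0.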
Q.19 Which unsupervised method is often used in recommendation systems?
Collaborative filtering
Logistic regression
Naïve Bayes
Linear regression
Explanation - Collaborative filtering recommends items by finding similar users or items from past interaction patterns. It is commonly treated as an unsupervised technique because it needs no explicit class labels.
Correct answer is: Collaborative filtering
Q.20 Which algorithm can handle non-linearly separable clusters?
K-Means
DBSCAN
Linear regression
Logistic regression
Explanation - DBSCAN can detect arbitrarily shaped clusters, whereas K-Means tends to produce roughly spherical (convex) clusters.
Correct answer is: DBSCAN
Q.21 Clustering is an example of:
Supervised learning
Unsupervised learning
Reinforcement learning
Semi-supervised learning
Explanation - Clustering does not require labeled data and is a key unsupervised learning task.
Correct answer is: Unsupervised learning
Q.22 The 'elbow method' is used to determine:
Number of iterations
Number of clusters
Number of features
Number of data points
Explanation - The elbow method plots the within-cluster sum of squares (inertia) against the number of clusters and picks the point where further increases in the cluster count yield only small improvements.
Correct answer is: Number of clusters
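The elbow can be seen numerically by running K-Means for several cluster counts and recording the within-cluster sum of squares (WCSS) each time. The sketch below does this for one-dimensional toy data; the function name `kmeans_wcss`, the data, and the evenly spaced seeding are assumptions chosen to keep the example deterministic.

```python
def kmeans_wcss(values, k, iterations=20):
    """1-D K-Means; returns the within-cluster sum of squares (WCSS)."""
    ordered = sorted(values)
    n = len(ordered)
    # Illustrative deterministic seeding: evenly spaced points of the sorted data.
    centroids = [ordered[int((i + 0.5) * n / k)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return sum((v - min(centroids, key=lambda c: abs(v - c))) ** 2 for v in values)

# Hypothetical data with three natural groups.
values = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8, 9.0, 9.2, 8.8]
wcss = {k: kmeans_wcss(values, k) for k in range(1, 5)}
for k, w in wcss.items():
    print(k, round(w, 2))
```

For this data the WCSS falls sharply up to k = 3 and barely changes at k = 4, so the elbow sits at three clusters.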
Q.23 Anomaly detection in network security often uses:
Classification models
Clustering algorithms
Linear regression
Decision trees
Explanation - Clustering is used to detect unusual patterns in network traffic that may indicate anomalies.
Correct answer is: Clustering algorithms
Q.24 Which unsupervised learning technique reduces dimensionality by learning latent features?
Autoencoders
Decision trees
Logistic regression
Naïve Bayes
Explanation - Autoencoders are neural networks that learn compressed representations of data in an unsupervised way.
Correct answer is: Autoencoders
Q.25 Cluster analysis can be applied in:
Gene expression data analysis
Weather forecasting
Stock price prediction
House price regression
Explanation - Clustering is widely used in bioinformatics to group genes with similar expression patterns.
Correct answer is: Gene expression data analysis
