Q.1 Which of the following is a supervised learning classification technique?
K-Means
Decision Tree
Apriori
DBSCAN
Explanation - Decision Tree is a supervised learning algorithm used for classification and regression tasks.
Correct answer is: Decision Tree
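Example - A minimal scikit-learn sketch (assuming scikit-learn is installed; the Iris dataset is used purely for illustration):
```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

# Supervised learning: the model trains on labeled (features, class) pairs.
X, y = load_iris(return_X_y=True)
clf = DecisionTreeClassifier(random_state=0).fit(X, y)
print(clf.predict(X[:3]))  # predicted class labels for the first three samples
```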
Q.2 In Naive Bayes classifier, what assumption is made about features?
Features are dependent
Features are independent
Features are continuous
Features are irrelevant
Explanation - Naive Bayes assumes that all features are independent given the class label.
Correct answer is: Features are independent
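Example - A toy sketch of the "naive" factorization P(x1,...,xn | c) = P(x1 | c) x ... x P(xn | c); the per-feature likelihoods and priors below are invented purely for illustration:
```python
import numpy as np

# Hypothetical per-feature likelihoods P(x_i | class) for a spam/ham example.
p_x_given_spam = np.array([0.8, 0.6, 0.3])
p_x_given_ham = np.array([0.2, 0.4, 0.7])
prior_spam, prior_ham = 0.5, 0.5

# The independence assumption lets us simply multiply per-feature likelihoods.
score_spam = prior_spam * p_x_given_spam.prod()
score_ham = prior_ham * p_x_given_ham.prod()
print("P(spam | x) =", score_spam / (score_spam + score_ham))  # 0.72
```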
Q.3 Which metric is commonly used to evaluate classification models?
Silhouette Score
Accuracy
Sum of Squares
Mean Squared Error
Explanation - Accuracy measures the ratio of correctly predicted instances to total instances and is commonly used to evaluate classification models.
Correct answer is: Accuracy
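Example - A quick worked check of the accuracy formula (labels are made up):
```python
from sklearn.metrics import accuracy_score

y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 0, 1]  # one of six predictions is wrong
print(accuracy_score(y_true, y_pred))  # 5/6 = 0.833...
```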
Q.4 Which of the following is a tree-based ensemble method for classification?
KNN
Random Forest
Linear Regression
K-Means
Explanation - Random Forest is an ensemble method that constructs multiple decision trees and outputs the mode of their predictions.
Correct answer is: Random Forest
Q.5 In classification, what does a Support Vector Machine (SVM) aim to find?
Cluster centers
Decision boundary maximizing margin
Frequent patterns
Regression line
Explanation - SVM finds the separating hyperplane that maximizes the margin between the classes in the feature space.
Correct answer is: Decision boundary maximizing margin
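Example - A minimal sketch of a linear SVM on synthetic two-class data (assuming scikit-learn); the margin is defined by the support vectors:
```python
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

X, y = make_blobs(n_samples=60, centers=2, random_state=0)
svm = SVC(kernel="linear", C=1.0).fit(X, y)
print("number of support vectors:", len(svm.support_vectors_))
print("hyperplane:", svm.coef_, svm.intercept_)  # w.x + b = 0
```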
Q.6 Which of these is a distance-based classifier?
K-Nearest Neighbors
Decision Tree
Naive Bayes
Logistic Regression
Explanation - KNN classifies instances based on the majority class of their nearest neighbors using a distance metric.
Correct answer is: K-Nearest Neighbors
Q.7 Logistic Regression is used for:
Predicting continuous values
Predicting probabilities of classes
Clustering data
Finding association rules
Explanation - Logistic Regression models the probability that a given input belongs to a particular class using the logistic function.
Correct answer is: Predicting probabilities of classes
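Example - A minimal sketch showing class probabilities from Logistic Regression (assuming scikit-learn; scaling is added so the solver converges):
```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)).fit(X, y)
# predict_proba applies the logistic (sigmoid) function to a linear score.
print(model.predict_proba(X[:2]))  # each row sums to 1 across the two classes
```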
Q.8 Which criterion is commonly used for splitting in a decision tree?
Entropy
Euclidean distance
Support
Lift
Explanation - Entropy measures the impurity of a set of examples; decision trees use it to decide the best split.
Correct answer is: Entropy
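Example - A worked information-gain computation using the entropy formula H = -sum(p_i * log2(p_i)); the labels are invented for illustration:
```python
import numpy as np

def entropy(labels):
    """Shannon entropy of a label set."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -(p * np.log2(p)).sum()

parent = [0, 0, 0, 0, 1, 1, 1, 1]          # 50/50 mix -> entropy = 1.0
left, right = [0, 0, 0, 1], [0, 1, 1, 1]   # one candidate split
gain = entropy(parent) - 0.5 * entropy(left) - 0.5 * entropy(right)
print(f"information gain: {gain:.3f}")      # ~0.189; higher gain = better split
```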
Q.9 Which of the following is NOT a classification algorithm?
Naive Bayes
KNN
Apriori
Random Forest
Explanation - Apriori is an algorithm for association rule mining, not classification.
Correct answer is: Apriori
Q.10 What does the ROC curve plot?
Precision vs Recall
True Positive Rate vs False Positive Rate
Support vs Confidence
Error rate vs Iterations
Explanation - The ROC curve plots the trade-off between true positive rate and false positive rate at various threshold settings.
Correct answer is: True Positive Rate vs False Positive Rate
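Example - Computing the ROC points with scikit-learn (the scores are hypothetical classifier outputs):
```python
from sklearn.metrics import roc_auc_score, roc_curve

y_true = [0, 0, 1, 1, 0, 1]
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9]  # predicted P(class = 1)
fpr, tpr, thresholds = roc_curve(y_true, y_score)
print(list(zip(fpr, tpr)))                  # one (FPR, TPR) point per threshold
print("AUC:", roc_auc_score(y_true, y_score))
```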
Q.11 Which of the following handles multi-class classification inherently?
Decision Tree
Binary SVM
Perceptron
K-Means
Explanation - Decision Trees handle any number of class labels natively, whereas binary classifiers such as SVM or the perceptron must be decomposed via one-vs-all or one-vs-one schemes.
Correct answer is: Decision Tree
Q.12 In KNN, increasing the value of k generally:
Increases overfitting
Increases smoothness of decision boundary
Decreases bias
Has no effect
Explanation - A larger k bases each prediction on a vote over more neighbors, which smooths the decision boundary and reduces variance at the cost of higher bias.
Correct answer is: Increases smoothness of decision boundary
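Example - A small experiment (assuming scikit-learn; dataset and k values chosen for illustration) showing how larger k trades variance for bias:
```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = make_moons(n_samples=300, noise=0.3, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
for k in (1, 5, 25):
    knn = KNeighborsClassifier(n_neighbors=k).fit(X_tr, y_tr)
    # k=1 memorizes the training data (jagged boundary); larger k smooths it.
    print(f"k={k}: train={knn.score(X_tr, y_tr):.2f}, test={knn.score(X_te, y_te):.2f}")
```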
Q.13 Which of the following can handle non-linear classification boundaries?
Linear Regression
Decision Tree
Naive Bayes
Apriori
Explanation - Decision Trees can model non-linear boundaries by partitioning the feature space recursively.
Correct answer is: Decision Tree
Q.14 Which evaluation metric is most suitable for imbalanced classification datasets?
Accuracy
Precision, Recall, F1-score
Mean Squared Error
Sum of Squared Errors
Explanation - Accuracy can be misleading in imbalanced datasets; precision, recall, and F1-score provide a better evaluation.
Correct answer is: Precision, Recall, F1-score
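Example - A sketch of why accuracy misleads on imbalanced data: a classifier that always predicts the majority class scores 90% accuracy yet has zero recall on the minority class.
```python
from sklearn.metrics import classification_report

y_true = [0] * 90 + [1] * 10   # 90% majority class
y_pred = [0] * 100             # "always predict 0" baseline
print(classification_report(y_true, y_pred, zero_division=0))
```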
Q.15 Which of these classifiers is probabilistic in nature?
Naive Bayes
KNN
Decision Tree
SVM
Explanation - Naive Bayes calculates the posterior probability of a class given input features using Bayes' theorem.
Correct answer is: Naive Bayes
Q.16 Pruning in decision trees is used to:
Reduce tree size and prevent overfitting
Increase depth of tree
Split nodes further
Improve training accuracy only
Explanation - Pruning removes branches that contribute little predictive power, reducing overfitting and improving generalization.
Correct answer is: Reduce tree size and prevent overfitting
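Example - A sketch of cost-complexity pruning in scikit-learn (the ccp_alpha value is illustrative, not tuned):
```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
full = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
pruned = DecisionTreeClassifier(random_state=0, ccp_alpha=0.01).fit(X_tr, y_tr)
# The pruned tree is smaller and usually generalizes better.
print("nodes:", full.tree_.node_count, "->", pruned.tree_.node_count)
print("test acc:", full.score(X_te, y_te), "->", pruned.score(X_te, y_te))
```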
Q.17 Which kernel is most commonly used in SVM for non-linear data?
Linear kernel
Polynomial kernel
Gaussian/RBF kernel
Sigmoid kernel
Explanation - The RBF kernel implicitly maps inputs into a higher-dimensional feature space, allowing SVM to learn non-linear decision boundaries.
Correct answer is: Gaussian/RBF kernel
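Example - A sketch on XOR-style data, where no straight line separates the classes, so a linear kernel fails while the RBF kernel succeeds:
```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]] * 25, dtype=float)
X += rng.normal(scale=0.1, size=X.shape)
y = np.array([0, 1, 1, 0] * 25)  # XOR labels: not linearly separable

print("linear:", SVC(kernel="linear").fit(X, y).score(X, y))  # near chance
print("rbf:   ", SVC(kernel="rbf").fit(X, y).score(X, y))     # near perfect
```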
Q.18 In classification, a confusion matrix shows:
True positives, true negatives, false positives, false negatives
Clusters of data
Association rules
Regression coefficients
Explanation - A confusion matrix summarizes the counts of correct and incorrect predictions for each class.
Correct answer is: True positives, true negatives, false positives, false negatives
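Example - A worked 2x2 confusion matrix on made-up labels:
```python
from sklearn.metrics import confusion_matrix

y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
# Rows are actual classes, columns are predicted classes:
# [[TN, FP],
#  [FN, TP]]
print(confusion_matrix(y_true, y_pred))  # [[3, 1], [1, 3]]
```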
Q.19 Which of the following is sensitive to irrelevant features?
KNN
Decision Tree
Naive Bayes
Random Forest
Explanation - KNN relies on distance calculations; irrelevant features can distort distances and reduce accuracy.
Correct answer is: KNN
Q.20 Which method is used to combine multiple classifiers for better performance?
Bagging and Boosting
Apriori mining
K-Means clustering
PCA
Explanation - Ensemble methods like Bagging and Boosting combine multiple classifiers to improve accuracy and reduce variance.
Correct answer is: Bagging and Boosting
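Example - Comparing a single tree against bagged and boosted ensembles (assuming scikit-learn; estimator counts are illustrative):
```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
models = {
    "single tree": DecisionTreeClassifier(random_state=0),
    "bagging": BaggingClassifier(n_estimators=50, random_state=0),    # parallel trees, reduces variance
    "boosting": AdaBoostClassifier(n_estimators=50, random_state=0),  # sequential, reweights errors
}
for name, model in models.items():
    print(name, cross_val_score(model, X, y, cv=5).mean())
```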
Q.21 Which algorithm is suitable for online (incremental) classification?
Random Forest
KNN
Decision Tree
Stochastic Gradient Descent classifier
Explanation - The SGD classifier updates model parameters incrementally with each training sample or mini-batch, making it suitable for online learning.
Correct answer is: Stochastic Gradient Descent classifier
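Example - A sketch of incremental training with partial_fit (the streamed mini-batches are synthetic):
```python
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)
clf = SGDClassifier(random_state=0)
classes = np.array([0, 1])  # must be declared on the first partial_fit call
for _ in range(100):        # simulate a stream of mini-batches
    X_batch = rng.normal(size=(32, 5))
    y_batch = (X_batch.sum(axis=1) > 0).astype(int)
    clf.partial_fit(X_batch, y_batch, classes=classes)
print(clf.predict(rng.normal(size=(3, 5))))
```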
Q.22 Which of the following is a limitation of Naive Bayes?
Cannot handle categorical data
Strong independence assumption
High computational cost
Cannot handle small datasets
Explanation - Naive Bayes assumes features are independent, which may not hold in real-world data, limiting performance.
Correct answer is: Strong independence assumption
Q.23 Which of the following can be used for feature importance estimation in classification?
Decision Tree
Naive Bayes
K-Means
Apriori
Explanation - Decision Trees can rank features based on their contribution to reducing impurity, indicating feature importance.
Correct answer is: Decision Tree
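Example - Reading impurity-based feature importances from a fitted tree (assuming scikit-learn; Iris is illustrative):
```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

data = load_iris()
tree = DecisionTreeClassifier(random_state=0).fit(data.data, data.target)
# Each score is the feature's normalized total impurity reduction.
for name, imp in zip(data.feature_names, tree.feature_importances_):
    print(f"{name}: {imp:.3f}")
```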
Q.24 Which method is robust to outliers in classification?
Decision Tree
KNN
Naive Bayes
Linear Regression
Explanation - Decision Trees are robust to outliers since splits are based on feature thresholds, not distance metrics.
Correct answer is: Decision Tree
Q.25 Which approach converts multi-class problems into multiple binary classification problems?
One-vs-All (OvA)
KNN
Naive Bayes
Random Forest
Explanation - One-vs-All decomposes multi-class classification into several binary classifiers, each distinguishing one class from others.
Correct answer is: One-vs-All (OvA)
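Example - A sketch of One-vs-All using scikit-learn's OneVsRestClassifier wrapper (the base estimator is chosen for illustration):
```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

X, y = load_iris(return_X_y=True)  # 3 classes
ova = OneVsRestClassifier(LogisticRegression(max_iter=1000)).fit(X, y)
print(len(ova.estimators_))  # 3 binary classifiers, one per class
print(ova.predict(X[:5]))
```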
