Q.1 What is the primary objective of supervised learning in gene expression analysis?
To cluster similar genes together
To predict the expression level of a gene given a set of features
To reduce dimensionality of the data
To identify outliers in the dataset
Explanation - Supervised learning uses labeled data (e.g., expression levels) to train models that can predict outcomes for new, unseen data.
Correct answer is: To predict the expression level of a gene given a set of features
Q.2 Which evaluation metric is most appropriate for assessing a protein‑binding site classifier when the classes are highly imbalanced?
Accuracy
Precision
Recall
Area under the ROC curve (AUC)
Explanation - AUC evaluates the trade‑off between true positive and false positive rates across all thresholds, making it suitable for imbalanced datasets.
Correct answer is: Area under the ROC curve (AUC)
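A minimal sketch of why a threshold-independent metric matters here (assuming scikit-learn; the 95/5 class split and the random scores are invented for illustration):

    import numpy as np
    from sklearn.metrics import accuracy_score, roc_auc_score

    rng = np.random.default_rng(0)
    y_true = np.array([0] * 95 + [1] * 5)          # only 5% positive binding sites
    y_pred_all_neg = np.zeros(100, dtype=int)      # classifier that never predicts a site
    y_score = rng.random(100)                      # hypothetical probability scores

    print(accuracy_score(y_true, y_pred_all_neg))  # 0.95 accuracy, yet the model is useless
    print(roc_auc_score(y_true, y_score))          # ~0.5 for random scores: correctly mediocre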
Q.3 In a convolutional neural network (CNN) applied to DNA sequences, what does a 1‑D convolution kernel learn?
Spatial patterns across 2D images
Temporal patterns in time series data
Motifs or local patterns along the nucleotide sequence
Global features of the whole sequence
Explanation - 1‑D convolutions slide a kernel along a single dimension, capturing local motifs within the linear sequence of nucleotides.
Correct answer is: Motifs or local patterns along the nucleotide sequence
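A minimal sketch of a 1-D convolution over one-hot encoded DNA (assuming PyTorch; the sequence length and kernel width are arbitrary choices):

    import torch
    import torch.nn as nn

    seq = torch.randint(0, 4, (1, 100))                 # toy sequence of 100 nucleotides
    x = nn.functional.one_hot(seq, num_classes=4)       # shape (1, 100, 4)
    x = x.permute(0, 2, 1).float()                      # Conv1d expects (batch, channels, length)

    conv = nn.Conv1d(in_channels=4, out_channels=16, kernel_size=8)
    motif_scores = conv(x)                              # each of the 16 kernels scans for an 8-bp motif
    print(motif_scores.shape)                           # torch.Size([1, 16, 93])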
Q.4 What is the main advantage of using a Random Forest over a single decision tree in predicting cancer subtypes?
It requires less computational power
It can handle missing values automatically
It reduces overfitting by averaging multiple trees
It always achieves 100% accuracy
Explanation - Random Forests aggregate predictions from many trees, smoothing out individual tree variance and improving generalization.
Correct answer is: It reduces overfitting by averaging multiple trees
Q.5 Which feature selection method removes irrelevant genes before feeding data into a machine learning model?
Principal Component Analysis (PCA)
Recursive Feature Elimination (RFE)
K‑Means clustering
t‑SNE
Explanation - RFE ranks features by importance and recursively discards the least important ones, effectively reducing dimensionality.
Correct answer is: Recursive Feature Elimination (RFE)
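A minimal sketch of RFE on a synthetic expression-like matrix (assuming scikit-learn; the data and the choice of 20 retained features are arbitrary):

    from sklearn.datasets import make_classification
    from sklearn.feature_selection import RFE
    from sklearn.linear_model import LogisticRegression

    X, y = make_classification(n_samples=80, n_features=200, n_informative=10, random_state=0)
    selector = RFE(LogisticRegression(max_iter=1000), n_features_to_select=20, step=10)
    selector.fit(X, y)
    print(selector.support_.sum(), "features kept")     # boolean mask over the 200 "genes"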
Q.6 Transfer learning in bioinformatics often starts with a model pre‑trained on which type of data?
Protein‑structure images
Genomic sequences from other species
Clinical lab measurements
Patient questionnaires
Explanation - Pre‑trained models on related species' genomes can transfer learned motifs, improving performance on limited human data.
Correct answer is: Genomic sequences from other species
Q.7 Which unsupervised learning technique is commonly used to identify sub‑populations of cells in single‑cell RNA‑seq data?
Linear Regression
K‑Means clustering
Logistic Regression
Support Vector Machine (SVM)
Explanation - K‑Means partitions cells into clusters based on expression similarity, revealing distinct cell types or states.
Correct answer is: K‑Means clustering
Q.8 In reinforcement learning applied to protein folding, the agent receives a reward when which event occurs?
The protein forms a disulfide bond
The protein achieves a lower free‑energy conformation
The protein folds faster than predicted
The protein contains more alpha‑helices
Explanation - Lower free energy indicates a more stable, biologically relevant fold, serving as a natural reward signal.
Correct answer is: The protein achieves a lower free‑energy conformation
Q.9 What does the term 'cross‑validation' mean in machine learning?
Testing a model on unseen data from a different domain
Training a model with random noise added
Splitting data into training and testing subsets multiple times
Using a single large dataset for training only
Explanation - Cross‑validation evaluates model robustness by repeatedly partitioning the data.
Correct answer is: Splitting data into training and testing subsets multiple times
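A minimal sketch of repeated train/test splitting via 5-fold cross-validation (assuming scikit-learn; the synthetic data is only for illustration):

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score

    X, y = make_classification(n_samples=200, n_features=50, random_state=0)
    scores = cross_val_score(RandomForestClassifier(random_state=0), X, y, cv=5)
    print(scores.mean(), "+/-", scores.std())            # average held-out accuracy across the 5 folds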
Q.10 Why are one‑hot encodings commonly used for DNA sequence data before feeding them into a neural network?
They reduce the dimensionality of the data
They preserve the order of nucleotides
They convert categorical nucleotides into numeric vectors
They make the data sparse and thus faster to compute
Explanation - One‑hot encoding transforms each nucleotide into a binary vector, enabling the network to process categorical information numerically.
Correct answer is: They convert categorical nucleotides into numeric vectors
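A minimal sketch of one-hot encoding a nucleotide string into a numeric matrix (plain NumPy; the A, C, G, T column order is an arbitrary convention):

    import numpy as np

    NUC_INDEX = {"A": 0, "C": 1, "G": 2, "T": 3}

    def one_hot_dna(seq):
        mat = np.zeros((len(seq), 4), dtype=np.float32)   # one row per position, one column per base
        for i, base in enumerate(seq):
            mat[i, NUC_INDEX[base]] = 1.0
        return mat

    print(one_hot_dna("ACGTA"))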
Q.11 Which of the following is NOT a typical step in preprocessing RNA‑seq data for machine learning?
Read alignment
Normalization
Feature extraction
Real‑time visualization
Explanation - While visualization aids interpretation, it is not a preprocessing step for model training.
Correct answer is: Real‑time visualization
Q.12 Dropout is a regularization technique that does what during training?
Adds random noise to the input data
Removes a random subset of neurons from the network each iteration
Normalizes the output of each layer
Permanently deletes underperforming weights
Explanation - Dropout randomly drops neurons to prevent over‑reliance on any particular feature.
Correct answer is: Removes a random subset of neurons from the network each iteration
Q.13 Which type of neural network is most suitable for modeling time‑dependent gene expression data?
Convolutional Neural Network (CNN)
Recurrent Neural Network (RNN)
Feed‑forward Neural Network
Autoencoder
Explanation - RNNs capture sequential dependencies, making them ideal for time‑series gene expression profiles.
Correct answer is: Recurrent Neural Network (RNN)
Q.14 What is the main goal of using an autoencoder on high‑dimensional gene expression data?
To classify cell types
To reduce dimensionality while preserving key patterns
To generate synthetic genes
To identify mutations
Explanation - Autoencoders learn compressed representations that retain important variance in the data.
Correct answer is: To reduce dimensionality while preserving key patterns
Q.15 In a supervised classification task, the F1‑score is a combination of which two metrics?
Accuracy and specificity
Precision and recall
Sensitivity and specificity
Precision and accuracy
Explanation - F1 is the harmonic mean of precision and recall, balancing false positives and false negatives.
Correct answer is: Precision and recall
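A minimal worked example of the harmonic mean behind F1 (the confusion-matrix counts are invented):

    def f1_from_counts(tp, fp, fn):
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        return 2 * precision * recall / (precision + recall)

    # precision = 40/50 = 0.80, recall = 40/60 ~= 0.67  ->  F1 ~= 0.73
    print(f1_from_counts(tp=40, fp=10, fn=20))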
Q.16 Which of these is an example of a structured output prediction problem in bioinformatics?
Predicting the presence of a disease
Predicting the secondary structure of a protein
Clustering patient records
Estimating the age of a patient
Explanation - Secondary structure prediction involves outputting a sequence of labels (e.g., helix, sheet) aligned with the input sequence.
Correct answer is: Predicting the secondary structure of a protein
Q.17 What role does the 'kernel trick' play in Support Vector Machines?
It reduces the number of training samples
It transforms data into a higher‑dimensional space to make it linearly separable
It normalizes the input features
It speeds up the training process
Explanation - The kernel trick allows SVMs to operate in implicit high‑dimensional feature spaces without explicit mapping.
Correct answer is: It transforms data into a higher‑dimensional space to make it linearly separable
Q.18 Which performance metric is most informative when a model’s false positives are more costly than false negatives in a disease‑diagnosis scenario?
Sensitivity
Specificity
Accuracy
ROC AUC
Explanation - Specificity measures the proportion of actual negatives correctly identified; a highly specific model therefore produces few false positives.
Correct answer is: Specificity
Q.19 Why is batch normalization used in deep learning models for bioinformatics?
To normalize gene expression levels across samples
To reduce internal covariate shift and stabilize training
To remove batch effects from the data
To increase the size of the training dataset
Explanation - Batch normalization standardizes activations, speeding up convergence and improving robustness.
Correct answer is: To reduce internal covariate shift and stabilize training
Q.20 Which of the following best describes 'transfer learning' in the context of predicting protein‑protein interactions?
Training a model from scratch using only human data
Using a pre‑trained model on a related task and fine‑tuning it on protein‑protein data
Transferring data from one species to another without any adaptation
Applying genetic algorithms to optimize the model
Explanation - Transfer learning leverages knowledge from a source task to improve performance on a target task.
Correct answer is: Using a pre‑trained model on a related task and fine‑tuning it on protein‑protein data
Q.21 In a k‑NN classifier, how does increasing 'k' affect the model's bias‑variance trade‑off?
Increases bias and decreases variance
Decreases bias and increases variance
Has no effect on bias or variance
Increases both bias and variance
Explanation - A larger k smooths predictions, reducing variance but increasing bias.
Correct answer is: Increases bias and decreases variance
Q.22 What is a key difference between a one‑class SVM and a binary SVM?
One‑class SVM can only handle continuous data
One‑class SVM is used for anomaly detection
Binary SVM requires labeled data while one‑class does not
Both statements are true
Explanation - One‑class SVM learns a boundary around normal data, flagging outliers as anomalies.
Correct answer is: One‑class SVM is used for anomaly detection
Q.23 Which type of deep learning architecture is commonly applied to predict the 3D structure of proteins from their amino‑acid sequence?
Recurrent Neural Network (RNN)
Convolutional Neural Network (CNN)
Graph Neural Network (GNN)
Feed‑forward Neural Network
Explanation - GNNs naturally model residues as nodes and interactions as edges, capturing spatial relationships.
Correct answer is: Graph Neural Network (GNN)
Q.24 Which technique is used to handle class imbalance in training a machine learning model for rare genetic disease prediction?
Undersampling the majority class
Oversampling the minority class using SMOTE
Adding noise to the minority class
All of the above
Explanation - Various resampling strategies, including SMOTE, help balance the dataset for better learning.
Correct answer is: All of the above
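A minimal sketch of oversampling the minority class with SMOTE (assuming the imbalanced-learn package; the 95/5 split is illustrative):

    from collections import Counter
    from sklearn.datasets import make_classification
    from imblearn.over_sampling import SMOTE

    X, y = make_classification(n_samples=500, weights=[0.95, 0.05], random_state=0)
    print("before:", Counter(y))
    X_res, y_res = SMOTE(random_state=0).fit_resample(X, y)
    print("after :", Counter(y_res))                      # minority class synthetically upsampled to parity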
Q.25 In feature extraction for DNA microarrays, what is the purpose of using a sliding window approach?
To generate overlapping sequences for training
To increase the number of samples by duplication
To normalize probe intensities
To filter out low‑quality probes
Explanation - Sliding windows create local subsequences, enabling models to capture regional motifs.
Correct answer is: To generate overlapping sequences for training
Q.26 Which loss function is commonly used for multi‑class classification in deep learning?
Mean Squared Error (MSE)
Cross‑Entropy Loss
Huber Loss
Poisson Loss
Explanation - Cross‑entropy measures the difference between predicted probabilities and true one‑hot labels.
Correct answer is: Cross‑Entropy Loss
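A minimal worked example of multi-class cross-entropy (the predicted probabilities are invented):

    import numpy as np

    probs = np.array([[0.7, 0.2, 0.1],                    # predicted class probabilities per sample
                      [0.1, 0.8, 0.1]])
    labels = np.array([0, 1])                             # true class indices
    loss = -np.log(probs[np.arange(len(labels)), labels]).mean()
    print(loss)                                           # -(log 0.7 + log 0.8) / 2 ~= 0.29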
Q.27 In a genetic algorithm used to optimize a neural network architecture, which component mimics biological evolution?
Selection
Mutation
Crossover
All of the above
Explanation - These operators simulate survival of the fittest, random mutation, and recombination.
Correct answer is: All of the above
Q.28 What is the main reason to use t‑SNE for visualizing high‑dimensional bioinformatics data?
It preserves global distances accurately
It is computationally efficient for very large datasets
It preserves local structure, revealing clusters
It reduces dimensionality to one dimension only
Explanation - t‑SNE focuses on preserving local neighborhoods, making it great for cluster visualization.
Correct answer is: It preserves local structure, revealing clusters
Q.29 Which of these is an example of a supervised learning algorithm?
K‑Means clustering
Principal Component Analysis (PCA)
Support Vector Machine (SVM) for classification
Hierarchical clustering
Explanation - SVM uses labeled data to find a decision boundary, making it supervised.
Correct answer is: Support Vector Machine (SVM) for classification
Q.30 During the training of a recurrent neural network on mRNA secondary structure prediction, why is teacher forcing applied?
To accelerate convergence by using the true previous output during training
To enforce weight sharing across time steps
To prevent overfitting by dropout
To regularize the loss function
Explanation - Teacher forcing feeds the correct output at each step, speeding up training of sequential models.
Correct answer is: To accelerate convergence by using the true previous output during training
Q.31 Which type of data augmentation is particularly useful when training CNNs on protein surface images?
Horizontal flipping
Rotations around multiple axes
Adding Gaussian noise to pixel values
Color jittering
Explanation - Protein surfaces are 3‑D, so rotations preserve structural validity while increasing dataset diversity.
Correct answer is: Rotations around multiple axes
Q.32 Which of the following metrics best captures the trade‑off between sensitivity and specificity in a binary classifier?
F1‑Score
Matthews Correlation Coefficient (MCC)
Precision
Recall
Explanation - MCC considers true/false positives and negatives, providing a balanced measure even with class imbalance.
Correct answer is: Matthews Correlation Coefficient (MCC)
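A minimal sketch of MCC computed directly from the four confusion-matrix counts (the counts are invented to mimic class imbalance):

    import math

    def mcc(tp, tn, fp, fn):
        num = tp * tn - fp * fn
        den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
        return num / den if den else 0.0

    print(mcc(tp=5, tn=90, fp=3, fn=2))                   # ~0.64 despite only 7% positives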
Q.33 In the context of next‑generation sequencing (NGS) data, what does a 'coverage depth' of 30× mean?
Each base is read 30 times on average
30% of the genome is covered by reads
Reads span 30 base‑pairs on average
The data set contains 30 million reads
Explanation - Coverage depth refers to the average number of times each nucleotide position is sequenced.
Correct answer is: Each base is read 30 times on average
Q.34 What is the purpose of using a confusion matrix in evaluating a disease‑prediction model?
To calculate the ROC curve
To summarize prediction outcomes across all classes
To measure the model’s loss
To normalize the feature space
Explanation - A confusion matrix shows true/false positives/negatives, providing insight into specific error types.
Correct answer is: To summarize prediction outcomes across all classes
Q.35 Which regularization technique adds a penalty equal to the absolute value of weights to the loss function?
L1 regularization
L2 regularization
Elastic Net
Dropout
Explanation - L1 regularization encourages sparsity by penalizing the absolute magnitude of weights.
Correct answer is: L1 regularization
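A minimal sketch of an L1-penalized loss (the MSE base loss and the lambda value are illustrative choices):

    import numpy as np

    def l1_regularized_mse(y_true, y_pred, weights, lam=0.01):
        mse = np.mean((y_true - y_pred) ** 2)
        return mse + lam * np.sum(np.abs(weights))        # penalty on absolute weight values

    # larger |weights| -> larger penalty, which pushes uninformative weights toward exactly zero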
Q.36 Which of these is NOT an example of a supervised learning model commonly used in bioinformatics?
Decision Tree
Random Forest
k‑Nearest Neighbors
Gaussian Mixture Model
Explanation - Gaussian Mixture Models are unsupervised, used for clustering rather than classification.
Correct answer is: Gaussian Mixture Model
Q.37 What does the 'softmax' function output in a neural network?
Binary predictions (0 or 1)
A probability distribution over multiple classes
Raw logits for regression
A single continuous value
Explanation - Softmax normalizes logits into probabilities that sum to one.
Correct answer is: A probability distribution over multiple classes
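A minimal sketch of softmax (the logits are invented):

    import numpy as np

    def softmax(logits):
        z = logits - logits.max()                         # shift by the max for numerical stability
        e = np.exp(z)
        return e / e.sum()

    print(softmax(np.array([2.0, 1.0, 0.1])))             # ~[0.66, 0.24, 0.10], sums to 1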
Q.38 When training a convolutional network for histopathology image classification, why might one use a pre‑trained ResNet as a feature extractor?
Because it eliminates the need for labeled data
Because it learns generic visual features that transfer to medical images
Because it guarantees perfect accuracy on medical data
Because it reduces the need for GPU resources
Explanation - ResNet's lower layers capture edges and textures common across images, aiding transfer learning.
Correct answer is: Because it learns generic visual features that transfer to medical images
Q.39 Which algorithm is specifically designed to handle data with high dimensionality and sparse features, such as gene expression datasets?
Naïve Bayes
Support Vector Machine (SVM) with linear kernel
k‑Nearest Neighbors
Decision Tree
Explanation - Linear SVMs efficiently process high‑dimensional sparse data by optimizing a hyperplane.
Correct answer is: Support Vector Machine (SVM) with linear kernel
Q.40 Which of the following best describes the 'bagging' strategy used in Random Forests?
Training a single large model on the entire dataset
Combining multiple weak learners trained on random subsets of data
Using bootstrapped datasets to reduce bias
Pruning trees to avoid overfitting
Explanation - Bagging averages predictions of many trees, each built on a bootstrap sample.
Correct answer is: Combining multiple weak learners trained on random subsets of data
Q.41 In a bioinformatics pipeline, what is the main purpose of 'variant calling' after DNA sequencing?
To identify differences between the sequenced genome and a reference genome
To assemble the genome from short reads
To filter out low‑quality reads
To predict gene function
Explanation - Variant calling detects SNPs, insertions, deletions, and other genetic alterations.
Correct answer is: To identify differences between the sequenced genome and a reference genome
Q.42 Which metric would you use to evaluate a regression model predicting drug concentrations?
Accuracy
Mean Absolute Error (MAE)
Area Under Curve (AUC)
F1‑Score
Explanation - MAE measures average absolute deviation between predicted and actual continuous values.
Correct answer is: Mean Absolute Error (MAE)
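A minimal worked example of MAE for continuous predictions (the values are invented):

    import numpy as np

    y_true = np.array([1.2, 0.8, 2.5])                    # measured drug concentrations
    y_pred = np.array([1.0, 1.0, 2.0])                    # model predictions
    print(np.mean(np.abs(y_true - y_pred)))               # (0.2 + 0.2 + 0.5) / 3 = 0.3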
Q.43 What does the term 'overfitting' refer to in the context of machine learning?
When a model performs poorly on the training data
When a model learns noise and performs well only on training data
When a model generalizes too well
When a model has too few parameters
Explanation - Overfitting results in high training accuracy but low test performance due to memorizing noise.
Correct answer is: When a model learns noise and performs well only on training data
Q.44 Which type of neural network is best suited for processing data represented as graphs, such as protein‑protein interaction networks?
Convolutional Neural Network (CNN)
Graph Neural Network (GNN)
Recurrent Neural Network (RNN)
Fully Connected Network
Explanation - GNNs directly operate on graph structures, aggregating node information from neighbors.
Correct answer is: Graph Neural Network (GNN)
Q.45 Why is it important to split a dataset into training, validation, and test sets?
To ensure the model trains quickly
To evaluate the model’s ability to generalize to unseen data
To avoid using a GPU during training
To increase the number of training samples
Explanation - Separating data prevents over‑optimistic estimates of performance.
Correct answer is: To evaluate the model’s ability to generalize to unseen data
Q.46 Which of the following best describes a 'kernel' in machine learning?
A method of data augmentation
A function that measures similarity between data points
A type of loss function
A regularization technique
Explanation - Kernel functions compute dot products in high‑dimensional feature spaces.
Correct answer is: A function that measures similarity between data points
Q.47 What is the main benefit of using a hierarchical clustering approach for phylogenetic tree construction?
It is the fastest clustering method available
It produces a tree that reflects evolutionary relationships
It requires labeled data
It always yields 100% accurate trees
Explanation - Hierarchical clustering groups sequences based on pairwise distances, forming a tree structure.
Correct answer is: It produces a tree that reflects evolutionary relationships
Q.48 In the context of deep learning, what is the primary role of an activation function?
To reduce the dimensionality of the input
To introduce non‑linearity into the model
To compute the loss value
To perform back‑propagation
Explanation - Activation functions allow networks to learn complex patterns beyond linear combinations.
Correct answer is: To introduce non‑linearity into the model
Q.49 Which of these is a common approach for handling missing values in gene expression data?
Delete all samples with missing values
Impute missing values using the mean of the gene expression
Replace missing values with zero
Ignore the missing values during training
Explanation - Mean imputation preserves overall data structure while handling missing entries.
Correct answer is: Impute missing values using the mean of the gene expression
Q.50 What does 'batch size' refer to in neural network training?
The number of epochs to train
The number of samples processed before updating weights
The total size of the training dataset
The size of the hidden layer
Explanation - Batch size determines how many samples are used per gradient update.
Correct answer is: The number of samples processed before updating weights
Q.51 Which technique is used to prevent a neural network from learning noise in the data during training?
Overfitting
Regularization
Bootstrapping
Cross‑validation
Explanation - Regularization adds constraints to the model to avoid overfitting to noise.
Correct answer is: Regularization
Q.52 In bioinformatics, what is the BLAST algorithm typically used for?
To predict protein folding
To align nucleotide or protein sequences against a database
To design primers for PCR
To simulate cellular pathways
Explanation - BLAST searches a query sequence against a database to find similar sequences.
Correct answer is: To align nucleotide or protein sequences against a database
Q.53 Which of the following is a characteristic of an unsupervised learning algorithm?
It requires labeled target values
It only works on binary classification
It discovers patterns without explicit labels
It always outputs a regression value
Explanation - Unsupervised learning finds structure in unlabeled data.
Correct answer is: It discovers patterns without explicit labels
Q.54 What is the purpose of 'one‑hot encoding' in processing categorical genomic data?
To compress the data
To convert categories into binary vectors
To reduce noise
To increase the dataset size
Explanation - One‑hot encoding represents each category as a separate binary feature, enabling numeric processing.
Correct answer is: To convert categories into binary vectors
Q.55 Which of the following best describes 'feature importance' in tree‑based models?
The weight assigned to each input node
The contribution of each feature to model predictions
The number of times a feature is used in a tree
All of the above
Explanation - Feature importance measures how much a feature influences the model’s decisions.
Correct answer is: All of the above
Q.56 Why is it important to use a 'validation set' during hyperparameter tuning?
To estimate how well the model will perform on unseen data
To reduce the size of the training dataset
To calculate the final test accuracy
To determine the number of epochs
Explanation - Validation data provides a realistic assessment of hyperparameter choices without biasing the test set.
Correct answer is: To estimate how well the model will perform on unseen data
Q.57 What is the role of the 'learning rate' in gradient‑based optimization?
It determines how often the model is evaluated
It controls the step size in updating weights
It sets the maximum number of epochs
It decides how many layers are added
Explanation - A higher learning rate may converge faster but risks overshooting, while a lower rate is more stable.
Correct answer is: It controls the step size in updating weights
Q.58 Which deep learning architecture is designed to preserve spatial hierarchies while reducing parameter count?
DenseNet
ResNet
AlexNet
VGG
Explanation - DenseNet connects each layer to all subsequent layers within a dense block, enabling feature reuse with relatively few parameters.
Correct answer is: DenseNet
Q.59 In a classification model, what does a confusion matrix entry of 'False Positive' represent?
A sample correctly classified as negative
A sample incorrectly classified as positive
A sample correctly classified as positive
A sample incorrectly classified as negative
Explanation - A false positive occurs when the model predicts the positive class for an actual negative sample.
Correct answer is: A sample incorrectly classified as positive
Q.60 Which method is commonly used to address high dimensionality in microarray data before classification?
Normalization
Feature selection
Data augmentation
Hyperparameter optimization
Explanation - Feature selection reduces the number of genes considered, improving model performance.
Correct answer is: Feature selection
Q.61 What does 'dropout' accomplish during the training of a neural network?
It adds noise to the inputs
It randomly drops hidden units to prevent co‑adaptation
It ensures the model uses all data points
It reduces the size of the output layer
Explanation - Dropout forces the network to learn redundant representations, reducing overfitting.
Correct answer is: It randomly drops hidden units to prevent co‑adaptation
Q.62 Which algorithm is most suitable for performing dimensionality reduction while preserving non‑linear relationships?
Principal Component Analysis (PCA)
t‑SNE
Linear Discriminant Analysis (LDA)
K‑Means clustering
Explanation - t‑SNE captures complex, non‑linear structures in high‑dimensional data.
Correct answer is: t‑SNE
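A minimal sketch of embedding a high-dimensional matrix with t-SNE (assuming scikit-learn; the toy matrix stands in for a cells x genes table):

    import numpy as np
    from sklearn.manifold import TSNE

    X = np.random.rand(300, 2000)                         # 300 "cells", 2000 "genes"
    embedding = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)
    print(embedding.shape)                                # (300, 2)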
Q.63 In a deep learning model for protein‑binding site detection, which layer type would be most effective for capturing local interaction patterns?
Fully connected layer
Convolutional layer
Recurrent layer
Pooling layer
Explanation - Convolutional layers slide filters over the input, detecting localized motifs relevant for binding.
Correct answer is: Convolutional layer
Q.64 What is a 'hyperparameter' in the context of training a neural network?
A parameter learned during training
A fixed value set before training
A weight that updates during back‑propagation
A regularization coefficient
Explanation - Hyperparameters (e.g., learning rate, batch size) are set externally and not updated during training.
Correct answer is: A fixed value set before training
Q.65 Which evaluation metric is most appropriate when the number of negative instances greatly exceeds positives in a disease detection scenario?
Accuracy
Precision
Recall
AUC‑PR (Precision‑Recall curve)
Explanation - AUC‑PR focuses on precision–recall trade‑off, especially useful for imbalanced data.
Correct answer is: AUC‑PR (Precision‑Recall curve)
Q.66 What is the purpose of a 'learning schedule' in neural network training?
To adjust the learning rate over epochs
To determine the number of layers
To fix the model architecture
To split data into training and test sets
Explanation - Learning schedules reduce the learning rate during training to improve convergence.
Correct answer is: To adjust the learning rate over epochs
Q.67 Which type of neural network is best suited for modeling sequences such as DNA or RNA?
Convolutional Neural Network (CNN)
Recurrent Neural Network (RNN)
Autoencoder
Decision Tree
Explanation - RNNs capture dependencies across sequence positions, essential for biological sequences.
Correct answer is: Recurrent Neural Network (RNN)
Q.68 Which of the following is a key advantage of using a Support Vector Machine (SVM) for classifying genomic data?
It does not require any feature scaling
It can handle high‑dimensional data efficiently
It always achieves 100% accuracy
It does not need labeled data
Explanation - SVMs find a separating hyperplane in high‑dimensional spaces, suitable for genomic features.
Correct answer is: It can handle high‑dimensional data efficiently
Q.69 What does the 'softmax' function do in a classification neural network?
It normalizes logits into a probability distribution
It applies a non‑linear activation to hidden layers
It computes the gradient for back‑propagation
It reduces dimensionality of the input
Explanation - Softmax transforms raw scores into probabilities that sum to one.
Correct answer is: It normalizes logits into a probability distribution
Q.70 Which of the following is an example of an unsupervised learning technique used in bioinformatics?
Support Vector Machine (SVM)
K‑Means clustering
Logistic Regression
Gradient Boosting
Explanation - K‑Means groups data points into clusters without using labels, a common unsupervised approach.
Correct answer is: K‑Means clustering
Q.71 In a genetic algorithm, what is the role of 'crossover'?
To create new individuals by combining parts of two parents
To mutate random bits in an individual
To evaluate fitness of an individual
To select individuals for reproduction
Explanation - Crossover exchanges genetic material between parents, producing new offspring.
Correct answer is: To create new individuals by combining parts of two parents
Q.72 Which of the following best describes the 'curse of dimensionality'?
High dimensional data can be processed faster
Distance metrics become less informative as dimensions increase
It is a problem only in low‑dimensional data
It only affects neural networks
Explanation - In high dimensions, points tend to be equidistant, making nearest‑neighbor methods ineffective.
Correct answer is: Distance metrics become less informative as dimensions increase
Q.73 Which machine learning algorithm is specifically designed to model time‑series data with memory of previous states?
Random Forest
Support Vector Machine
Recurrent Neural Network (RNN)
k‑Nearest Neighbors
Explanation - RNNs maintain hidden states that capture temporal dependencies.
Correct answer is: Recurrent Neural Network (RNN)
Q.74 Which evaluation metric is commonly used to measure how well a clustering algorithm has grouped similar items together?
Silhouette Score
Accuracy
Precision
Recall
Explanation - The silhouette score evaluates cohesion and separation of clusters.
Correct answer is: Silhouette Score
Q.75 What is the main purpose of 'cross‑entropy loss' in a classification problem?
To compute the difference between predicted and true values in regression
To penalize incorrect probabilistic predictions
To enforce sparsity in weights
To measure similarity between sequences
Explanation - Cross‑entropy loss quantifies how far predicted probabilities diverge from true labels.
Correct answer is: To penalize incorrect probabilistic predictions
Q.76 Which of the following is a key advantage of using an autoencoder for gene expression data?
It always achieves perfect reconstruction
It reduces dimensionality while preserving key patterns
It eliminates the need for labeled data
It requires no computational resources
Explanation - Autoencoders learn a compressed representation that captures essential structure.
Correct answer is: It reduces dimensionality while preserving key patterns
Q.77 In machine learning pipelines, what is 'feature scaling' and why is it important?
It normalizes input values to a similar range, improving optimization convergence
It selects the most important features automatically
It increases the dimensionality of the data
It converts categorical data into numeric data
Explanation - Feature scaling prevents attributes with large ranges from dominating the learning process.
Correct answer is: It normalizes input values to a similar range, improving optimization convergence
Q.78 Which of these neural network architectures is best suited for capturing long‑range dependencies in genomic sequences?
Feed‑forward network
Convolutional neural network (CNN)
Long Short‑Term Memory (LSTM) network
Autoencoder
Explanation - LSTMs maintain memory over long sequences, ideal for genomic data with distant motifs.
Correct answer is: Long Short‑Term Memory (LSTM) network
Q.79 What does 'epoch' refer to in the context of training a neural network?
The number of layers in the network
The number of times the entire training dataset is passed through the network
The learning rate schedule
The number of neurons in the output layer
Explanation - An epoch is one full pass of all training samples during learning.
Correct answer is: The number of times the entire training dataset is passed through the network
Q.80 Which technique is used to evaluate the performance of a regression model predicting drug concentrations?
Mean Absolute Error (MAE)
Accuracy
Precision
Recall
Explanation - MAE measures the average absolute difference between predicted and actual values.
Correct answer is: Mean Absolute Error (MAE)
Q.81 In the context of next‑generation sequencing, what does a 'read alignment' step accomplish?
It assembles the reads into a complete genome
It matches sequencing reads to a reference genome
It removes duplicate reads
It predicts gene expression levels
Explanation - Read alignment aligns short sequence reads to a reference for variant calling.
Correct answer is: It matches sequencing reads to a reference genome
Q.82 Which of the following is a common challenge when working with RNA‑seq data for machine learning?
The data is too small for training
The data has high dimensionality and sparse counts
The data is always balanced
The data requires no preprocessing
Explanation - RNA‑seq generates thousands of genes with many zeros, complicating learning.
Correct answer is: The data has high dimensionality and sparse counts
Q.83 What does 'k‑fold cross‑validation' involve?
Training the model k times on the entire dataset
Dividing the data into k subsets and rotating the test set
Using k as the number of hidden layers
Scaling features by a factor of k
Explanation - k‑fold CV partitions data into k folds, each used once for testing.
Correct answer is: Dividing the data into k subsets and rotating the test set
Q.84 Which metric is most appropriate when evaluating a model that predicts a continuous value in a bioinformatics context?
Accuracy
Precision
Mean Squared Error (MSE)
Recall
Explanation - MSE is a standard loss for regression tasks measuring squared difference.
Correct answer is: Mean Squared Error (MSE)
Q.85 Which of the following best describes a 'hyperplane' in the context of Support Vector Machines?
A two‑dimensional plane separating classes
A line connecting two support vectors
A decision boundary that maximizes margin between classes
A set of weights used in training
Explanation - An SVM hyperplane separates classes with the largest possible margin.
Correct answer is: A decision boundary that maximizes margin between classes
Q.86 In a convolutional neural network, what is the purpose of the 'pooling' layer?
To increase the number of features
To reduce spatial dimensions and computation
To initialize weights
To compute the loss function
Explanation - Pooling down‑samples feature maps, reducing parameters and overfitting risk.
Correct answer is: To reduce spatial dimensions and computation
Q.87 Which of the following is a key advantage of using a ReLU activation function in neural networks?
It is non‑linear and helps with vanishing gradients
It always outputs values between 0 and 1
It reduces the number of parameters
It performs well with small datasets
Explanation - ReLU introduces non‑linearity and mitigates vanishing gradients for deep nets.
Correct answer is: It is non‑linear and helps with vanishing gradients
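A minimal sketch of ReLU (the inputs are invented):

    import numpy as np

    def relu(x):
        return np.maximum(0.0, x)                         # gradient is 1 for x > 0 and 0 otherwise

    print(relu(np.array([-2.0, -0.5, 0.0, 1.5])))         # [0.  0.  0.  1.5]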
Q.88 What is the primary purpose of a 'loss function' in machine learning?
To compute the gradient for back‑propagation
To measure the difference between predicted and true values
To select the best hyperparameters
To reduce overfitting
Explanation - The loss function quantifies prediction error, guiding weight updates.
Correct answer is: To measure the difference between predicted and true values
Q.89 Which of the following best describes a 'kernel trick' in SVM?
Computing distances in the input space
Transforming data into a higher‑dimensional space via a kernel function
Using a different loss function
Reducing dimensionality with PCA
Explanation - The kernel trick allows SVMs to operate implicitly in high‑dimensional feature spaces.
Correct answer is: Transforming data into a higher‑dimensional space via a kernel function
Q.90 Which type of neural network is specifically designed to capture spatial hierarchies in data such as images or 1‑D sequences?
Recurrent Neural Network (RNN)
Convolutional Neural Network (CNN)
Autoencoder
Feed‑forward Neural Network
Explanation - CNNs use convolutional layers to learn hierarchical spatial features.
Correct answer is: Convolutional Neural Network (CNN)
Q.91 Which of the following metrics is most appropriate for evaluating the performance of a model that detects rare genetic variants?
Accuracy
Recall
Precision
Matthews Correlation Coefficient (MCC)
Explanation - MCC provides a balanced measure even with extreme class imbalance.
Correct answer is: Matthews Correlation Coefficient (MCC)
Q.92 What does the 'bias term' do in a neural network layer?
It shifts a neuron's output by a learnable offset, independent of its inputs
It regularizes the model to prevent overfitting
It increases the number of neurons
It is added to the loss function
Explanation - The bias term adjusts the output of a neuron independently of its inputs.
Correct answer is: It shifts a neuron's output by a learnable offset, independent of its inputs
Q.93 Which technique is commonly used to prevent overfitting in deep learning models?
Batch normalization
Data augmentation
Dropout
All of the above
Explanation - All listed methods help reduce overfitting by regularizing or augmenting data.
Correct answer is: All of the above
Q.94 In a classification problem with many classes, which metric would you use to compare the performance of different models?
Overall accuracy
Macro‑averaged F1‑score
Micro‑averaged F1‑score
All of the above
Explanation - Macro‑averaging gives equal weight to all classes, useful when classes are imbalanced.
Correct answer is: Macro‑averaged F1‑score
Q.95 What does a 'one‑class SVM' primarily aim to do in bioinformatics applications?
Model normal samples and flag anomalies
The most frequent class in the data
All classes equally
The class with the largest margin
Explanation - One‑class SVM learns a boundary around normal data, labeling outliers as anomalies.
Correct answer is: Model normal samples and flag anomalies
Q.96 Which of the following is NOT a common preprocessing step for RNA‑seq data before machine learning?
Normalization
Batch effect correction
Feature scaling
Data duplication
Explanation - Data duplication is generally avoided as it can introduce bias.
Correct answer is: Data duplication
Q.97 What is the purpose of a 'feature importance score' in a Random Forest model?
To quantify each feature's contribution to the model's predictions
To evaluate the model’s accuracy
To determine the optimal learning rate
To calculate the loss function
Explanation - Feature importance indicates the influence of each feature on predictions.
Correct answer is: To quantify each feature's contribution to the model's predictions
Q.98 In the context of bioinformatics, what does the term 'motif' refer to?
A short, recurring pattern in DNA or protein sequences
A computational algorithm for alignment
A type of neural network
A statistical test for significance
Explanation - Motifs are conserved sequence patterns associated with functional or structural roles.
Correct answer is: A short, recurring pattern in DNA or protein sequences
Q.99 Which of the following best describes the 'dropout rate' in a neural network?
The learning rate used during training
The proportion of neurons randomly dropped during each training step
The fraction of data used for validation
The number of epochs to train
Explanation - Dropout rate specifies how many neurons are temporarily disabled to prevent overfitting.
Correct answer is: The proportion of neurons randomly dropped during each training step
Q.100 What is the primary difference between a 'CNN' and a 'RNN' for sequence data?
CNNs capture local patterns, RNNs capture sequential dependencies
CNNs require more parameters than RNNs
RNNs can only handle images, CNNs only handle text
CNNs are unsupervised, RNNs are supervised
Explanation - CNNs slide filters over the sequence, whereas RNNs maintain state across positions.
Correct answer is: CNNs capture local patterns, RNNs capture sequential dependencies
Q.101 Which evaluation metric is best suited to assess a model that predicts whether a variant is pathogenic or benign?
Accuracy
Recall
Precision
Area under the Precision‑Recall curve (AUC‑PR)
Explanation - AUC‑PR focuses on the trade‑off between precision and recall, important for imbalanced variant data.
Correct answer is: Area under the Precision‑Recall curve (AUC‑PR)
Q.102 In a neural network, what is the role of the 'softmax' layer?
To compute probabilities for multi‑class classification
To perform dimensionality reduction
To normalize inputs
To calculate gradient updates
Explanation - Softmax converts raw logits into a probability distribution over classes.
Correct answer is: To compute probabilities for multi‑class classification
Q.103 Which of the following is a common method for reducing the dimensionality of gene expression data before classification?
Feature selection
PCA
Both A and B
None of the above
Explanation - Both feature selection and PCA are widely used to reduce dimensionality.
Correct answer is: Both A and B
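A minimal sketch of the PCA half of that answer (assuming scikit-learn; the matrix dimensions are illustrative):

    import numpy as np
    from sklearn.decomposition import PCA

    X = np.random.rand(100, 5000)                         # 100 samples x 5000 genes
    X_reduced = PCA(n_components=10).fit_transform(X)
    print(X_reduced.shape)                                # (100, 10)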
Q.104 What is the purpose of the 'learning rate decay' schedule in deep learning?
To increase the learning rate over time
To gradually reduce the learning rate during training
To adjust the batch size dynamically
To change the number of epochs
Explanation - Learning rate decay helps refine convergence as training progresses.
Correct answer is: To gradually reduce the learning rate during training
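A minimal sketch of an exponential learning-rate decay schedule (the initial rate and decay factor are arbitrary):

    def decayed_lr(initial_lr, decay_rate, epoch):
        return initial_lr * decay_rate ** epoch           # smaller steps as training progresses

    for epoch in (0, 10, 50):
        print(epoch, decayed_lr(0.01, 0.95, epoch))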
Q.105 Which of the following describes a 'Gaussian Mixture Model (GMM)'?
A supervised classification model
An unsupervised clustering algorithm based on Gaussian distributions
A linear regression technique
A type of neural network
Explanation - GMM assumes data points are generated from a mixture of Gaussians.
Correct answer is: An unsupervised clustering algorithm based on Gaussian distributions
Q.106 Which of the following best describes the 'Adam' optimizer in training deep neural networks?
It uses only first‑order gradients
It adapts learning rates for each parameter based on first and second moments of gradients
It is equivalent to Stochastic Gradient Descent (SGD)
It requires no hyperparameters
Explanation - Adam computes adaptive learning rates using moving averages of gradients and squared gradients.
Correct answer is: It adapts learning rates for each parameter based on first and second moments of gradients
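A minimal sketch of a single Adam update, showing the first- and second-moment estimates (hyperparameter defaults follow the common convention; this is illustrative, not a full optimizer):

    import numpy as np

    def adam_step(w, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
        m = b1 * m + (1 - b1) * grad                      # running mean of gradients (1st moment)
        v = b2 * v + (1 - b2) * grad ** 2                 # running mean of squared gradients (2nd moment)
        m_hat = m / (1 - b1 ** t)                         # bias correction for early steps
        v_hat = v / (1 - b2 ** t)
        w = w - lr * m_hat / (np.sqrt(v_hat) + eps)       # per-parameter adaptive step
        return w, m, v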
Q.107 What is the main purpose of using a 'validation set' during model development?
To evaluate the model’s performance on unseen data and tune hyperparameters
To compute the final test accuracy
To increase the size of the training dataset
To replace the training set
Explanation - The validation set guides hyperparameter selection without leaking test information.
Correct answer is: To evaluate the model’s performance on unseen data and tune hyperparameters
Q.108 Which of the following best describes a 'deep learning' model?
A model with more than two hidden layers
A model that uses shallow, single‑layer networks
A model that does not require labeled data
A model that only works with images
Explanation - Deep learning refers to neural networks with many hierarchical layers.
Correct answer is: A model with more than two hidden layers
Q.109 In a classification problem, which metric gives equal importance to both classes in an imbalanced dataset?
Accuracy
Precision
Recall
Balanced Accuracy
Explanation - Balanced accuracy averages recall across classes, mitigating class imbalance bias.
Correct answer is: Balanced Accuracy
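A minimal worked example of balanced accuracy (the counts are invented to show the effect of imbalance):

    def balanced_accuracy(tp, tn, fp, fn):
        sensitivity = tp / (tp + fn)                      # recall on the positive class
        specificity = tn / (tn + fp)                      # recall on the negative class
        return (sensitivity + specificity) / 2

    # sensitivity = 8/10 = 0.80, specificity = 50/90 ~= 0.56  ->  ~0.68
    print(balanced_accuracy(tp=8, fn=2, tn=50, fp=40))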
Q.110 Which of the following is an example of a 'reinforcement learning' application in bioinformatics?
Predicting protein‑protein interactions
Training an agent to design stable protein folds
Classifying gene expression profiles
Identifying motifs in DNA sequences
Explanation - Reinforcement learning can optimize protein design by rewarding stable conformations.
Correct answer is: Training an agent to design stable protein folds
Q.111 What is the main challenge of applying deep learning to small bioinformatics datasets?
Model overfitting due to limited data
Inability to capture non‑linear relationships
Difficulty in visualizing results
Large memory requirements
Explanation - Small datasets increase overfitting risk, requiring regularization or data augmentation.
Correct answer is: Model overfitting due to limited data
Q.112 Which of the following best describes a 'one‑hot encoding' scheme for amino acid sequences?
Each amino acid is represented by a real number between 0 and 1
Each amino acid is encoded as a binary vector of length 20
All amino acids are assigned the same vector
It converts sequences into images
Explanation - One‑hot vectors represent each of the 20 standard amino acids uniquely.
Correct answer is: Each amino acid is encoded as a binary vector of length 20
Q.113 In a machine learning pipeline, what is 'normalization' most commonly applied to?
Label data
Input features
Output predictions
Training epochs
Explanation - Normalizing features ensures they are on comparable scales for efficient learning.
Correct answer is: Input features
Q.114 What does the 'batch size' determine in a training loop?
The number of epochs to train
The number of samples processed before the model updates weights
The learning rate schedule
The size of the training dataset
Explanation - Batch size controls how many data points are used to compute a single gradient update.
Correct answer is: The number of samples processed before the model updates weights
Q.115 Which metric is especially informative when dealing with highly imbalanced datasets in classification?
Accuracy
Precision
Recall
Matthews Correlation Coefficient (MCC)
Explanation - MCC takes all four confusion matrix categories into account, providing a balanced measure.
Correct answer is: Matthews Correlation Coefficient (MCC)
Q.116 Which of the following best describes an 'autoencoder' in machine learning?
A supervised classifier
An unsupervised model that learns a compressed representation of input data
A clustering algorithm
A reinforcement learning agent
Explanation - Autoencoders encode input data into a lower‑dimensional latent space and reconstruct it.
Correct answer is: An unsupervised model that learns a compressed representation of input data
Q.117 In the context of genomics, what does 'variant calling' involve?
Aligning sequencing reads to a reference genome
Identifying differences between the sample genome and the reference
Predicting gene expression levels
Normalizing read counts
Explanation - Variant calling detects SNPs, indels, and other genomic alterations.
Correct answer is: Identifying differences between the sample genome and the reference
Q.118 Which of the following is a key benefit of using a 'pre‑trained language model' for predicting RNA‑secondary structure?
It requires no training data
It captures sequence context learned from vast RNA corpora
It always achieves perfect accuracy
It eliminates the need for feature engineering
Explanation - Pre‑trained language models encode rich contextual information that can be fine‑tuned for specific tasks.
Correct answer is: It captures sequence context learned from vast RNA corpora
Q.119 Which of the following is a typical preprocessing step before feeding RNA‑seq counts into a machine learning model?
Log transformation
One‑hot encoding
Feature selection
All of the above
Explanation - Log transform stabilizes variance, one‑hot encoding handles categorical data, and feature selection reduces dimensionality.
Correct answer is: All of the above
Q.120 What does 'cross‑entropy' measure in a classification context?
The distance between two probability distributions
The mean squared error between predictions and labels
The variance of predictions
The correlation coefficient
Explanation - Cross‑entropy quantifies the dissimilarity between predicted probabilities and true labels.
Correct answer is: The distance between two probability distributions
Q.121 Which of the following best explains why 'dropout' can improve model generalization?
It reduces the number of parameters in the model
It encourages the network to learn redundant representations
It speeds up training by skipping computations
It prevents the model from learning any patterns
Explanation - Dropout forces neurons to be robust by not depending on any single feature.
Correct answer is: It encourages the network to learn redundant representations
Q.122 Which of the following is an advantage of using a 'graph neural network' in bioinformatics?
It can directly process relational data such as protein interaction networks
It always requires fewer training samples
It is only applicable to image data
It cannot handle large graphs
Explanation - GNNs operate on graph structures, making them ideal for relational biological data.
Correct answer is: It can directly process relational data such as protein interaction networks
Q.123 In a supervised learning task for predicting protein‑protein binding affinity, which of the following would be considered a 'label'?
The amino‑acid sequence of the protein
The 3D structure of the protein complex
A continuous value representing binding free energy
The number of genes expressed
Explanation - The label is the target variable the model is trained to predict.
Correct answer is: A continuous value representing binding free energy
Q.124 What is the primary function of an 'activation function' in a neural network layer?
To add noise to the input
To introduce non‑linearity into the network
To compute the loss value
To reduce dimensionality
Explanation - Activation functions enable neural networks to model complex patterns.
Correct answer is: To introduce non‑linearity into the network
Q.125 Which of the following metrics is most appropriate for a binary classification problem with a highly skewed class distribution?
Accuracy
Precision
Recall
Matthews Correlation Coefficient (MCC)
Explanation - MCC provides a balanced metric that accounts for all confusion matrix entries.
Correct answer is: Matthews Correlation Coefficient (MCC)
Q.126 What does 't‑SNE' primarily aim to preserve when reducing dimensionality?
Global pairwise distances
Local neighborhood relationships
Variance of the data
Data density in high dimensional space
Explanation - t‑SNE focuses on preserving local structure, producing meaningful clusters.
Correct answer is: Local neighborhood relationships
Q.127 Which of the following best describes a 'feature vector' in machine learning?
A single numeric value representing a sample
A list of numeric attributes describing a sample
A set of labels for training
A model architecture
Explanation - A feature vector encodes all relevant information about an instance for learning.
Correct answer is: A list of numeric attributes describing a sample
Q.128 In a supervised learning setting for disease risk prediction, why is it important to separate the data into training and testing sets?
To avoid overfitting and obtain a realistic estimate of model performance
To increase the size of the training data
To reduce the computational load
To remove irrelevant features
Explanation - Separating data ensures that performance metrics reflect generalization to new samples.
Correct answer is: To avoid overfitting and obtain a realistic estimate of model performance
Q.129 Which type of neural network is best suited for learning patterns from 3‑D volumetric data, such as cryo‑EM images?
1‑D CNN
2‑D CNN
3‑D CNN
Recurrent Neural Network (RNN)
Explanation - 3‑D CNNs can capture spatial relationships across three dimensions.
Correct answer is: 3‑D CNN
Q.130 What is a typical use of a 'latent space' in a generative model such as a VAE?
To store training labels
To compress input data into a lower‑dimensional representation for generation
To accelerate training speed
To evaluate model accuracy
Explanation - The latent space captures essential features that can be decoded into new samples.
Correct answer is: To compress input data into a lower‑dimensional representation for generation
Q.131 Which of the following best describes the 'Adam' optimizer's key feature?
It uses a fixed learning rate
It adapts the learning rate for each parameter based on gradient moments
It is equivalent to standard gradient descent
It requires no hyperparameters
Explanation - Adam computes adaptive learning rates using moving averages of gradients and their squares.
Correct answer is: It adapts the learning rate for each parameter based on gradient moments
Q.132 Which of the following is NOT typically used for dimensionality reduction in gene expression data?
PCA
t‑SNE
SMOTE
Autoencoder
Explanation - SMOTE is a resampling technique for handling class imbalance, not dimensionality reduction.
Correct answer is: SMOTE
Q.133 In a binary classification task, what does a high false‑negative rate indicate?
Many positive cases are incorrectly labeled as negative
Many negative cases are incorrectly labeled as positive
The model has high precision
The model has high accuracy
Explanation - False negatives occur when positives are missed, reducing recall.
Correct answer is: Many positive cases are incorrectly labeled as negative
Q.134 Which of the following best describes the concept of 'regularization' in machine learning?
Adding noise to the data
Adding a penalty term to the loss function to prevent overfitting
Increasing the number of epochs
Normalizing input features
Explanation - Regularization discourages overly complex models by penalizing large weights.
Correct answer is: Adding a penalty term to the loss function to prevent overfitting
Q.135 What is a 'latent variable' in a generative model?
An observable variable in the dataset
A hidden variable that explains observed data patterns
A feature that is always zero
A hyperparameter of the model
Explanation - Latent variables capture underlying structure that generates the observed data.
Correct answer is: A hidden variable that explains observed data patterns
Q.136 Which evaluation metric is specifically designed for regression tasks in bioinformatics?
Accuracy
F1‑Score
Root Mean Square Error (RMSE)
Recall
Explanation - RMSE measures the average magnitude of prediction errors in regression.
Correct answer is: Root Mean Square Error (RMSE)
Q.137 In a CNN used for predicting DNA binding sites, which layer is responsible for detecting motifs?
Input layer
Convolutional layer
Pooling layer
Output layer
Explanation - Convolutional filters learn local patterns (motifs) across the sequence.
Correct answer is: Convolutional layer
Q.138 Which of the following best explains why 'log‑transformation' is used on count data before machine learning?
It linearizes relationships between variables
It reduces the effect of outliers and stabilizes variance
It makes data categorical
It speeds up training
Explanation - Log‑transform mitigates skewness in count data, improving model performance.
Correct answer is: It reduces the effect of outliers and stabilizes variance
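A minimal sketch of a log(1 + x) transform on skewed count data (the counts are invented):

    import numpy as np

    counts = np.array([0, 3, 10, 1500, 90000])            # raw RNA-seq-style counts
    print(np.log1p(counts))                               # [0.   1.39  2.40  7.31  11.41]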
Q.139 What is the main purpose of using a 'validation set' during hyperparameter tuning?
To evaluate model performance on unseen data
To replace the training set
To increase the dataset size
To compute the final test accuracy
Explanation - The validation set provides an unbiased estimate of how hyperparameters affect performance.
Correct answer is: To evaluate model performance on unseen data
Q.140 Which of the following is a common technique to handle high dimensional gene expression data?
Principal Component Analysis (PCA)
Feature selection
Both A and B
None of the above
Explanation - Both PCA and feature selection reduce dimensionality to mitigate the curse of dimensionality.
Correct answer is: Both A and B
Q.141 In a supervised learning model, what does 'overfitting' mean?
The model performs well on test data
The model performs poorly on training data
The model captures noise from training data and fails to generalize
The model has too few parameters
Explanation - Overfitting occurs when a model learns training data intricacies that do not generalize.
Correct answer is: The model captures noise from training data and fails to generalize
Q.142 Which of the following best describes a 'graph convolutional network' (GCN)?
A neural network that processes sequences via convolution
A neural network that processes graph‑structured data via convolution
A clustering algorithm
A reinforcement learning agent
Explanation - GCNs generalize convolution operations to graph data, making them suitable for network biology.
Correct answer is: A neural network that processes graph‑structured data via convolution
Q.143 Which metric would you use to evaluate a multi‑class classification model if all classes are equally important?
Accuracy
Macro‑averaged F1‑Score
Micro‑averaged Precision
All of the above
Explanation - Macro‑averaging treats all classes equally, providing a balanced metric.
Correct answer is: Macro‑averaged F1‑Score
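A quick scikit-learn illustration with made-up labels:

    from sklearn.metrics import f1_score

    y_true = [0, 0, 1, 1, 2, 2, 2]
    y_pred = [0, 1, 1, 1, 2, 0, 2]
    print(f1_score(y_true, y_pred, average="macro"))   # unweighted mean of per-class F1
    print(f1_score(y_true, y_pred, average="micro"))   # pools all decisions together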
Q.144 What is the main advantage of using an 'ensemble' of models in bioinformatics?
It always improves accuracy regardless of data
It reduces the variance and improves robustness compared to a single model
It eliminates the need for feature selection
It speeds up training time
Explanation - Ensembling combines multiple models, averaging out individual errors.
Correct answer is: It reduces the variance and improves robustness compared to a single model
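A small sketch of the averaging idea, with made-up probabilities from three hypothetical models:

    import numpy as np

    p_model1 = np.array([0.9, 0.2, 0.6, 0.4])      # predicted class-1 probabilities
    p_model2 = np.array([0.8, 0.3, 0.7, 0.5])
    p_model3 = np.array([0.7, 0.1, 0.8, 0.3])

    ensemble_prob = np.mean([p_model1, p_model2, p_model3], axis=0)  # average the votes
    ensemble_label = (ensemble_prob >= 0.5).astype(int)
    print(ensemble_prob, ensemble_label)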
Q.145 Which of the following is an example of a supervised learning task in bioinformatics?
Clustering gene expression profiles
Predicting drug response based on genotype
Dimensionality reduction of genomic data
Identifying motifs in DNA sequences
Explanation - Drug response prediction uses labeled outcomes (response vs. non‑response).
Correct answer is: Predicting drug response based on genotype
Q.146 Which of the following best explains the 'gradient descent' algorithm?
A method for computing the optimal hyperparameters
An iterative optimization algorithm that updates parameters in the direction of the steepest descent of the loss function
A data preprocessing technique
A type of loss function
Explanation - Gradient descent finds minima by following the negative gradient of the loss.
Correct answer is: An iterative optimization algorithm that updates parameters in the direction of the steepest descent of the loss function
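A minimal sketch minimizing a one-dimensional toy loss f(w) = (w - 3)^2:

    def grad(w):
        return 2 * (w - 3)        # derivative of the loss

    w, lr = 0.0, 0.1              # initial weight and learning rate
    for _ in range(100):
        w -= lr * grad(w)         # step along the negative gradient
    print(w)                      # converges toward the minimum at w = 3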
Q.147 In the context of neural networks, what does 'weight decay' refer to?
Increasing the weight values during training
Adding a penalty term to the loss function to reduce large weights
Removing weights that are zero
Normalizing input features
Explanation - Weight decay acts like L2 regularization, discouraging large weight values.
Correct answer is: Adding a penalty term to the loss function to reduce large weights
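In practice, weight decay is often passed straight to the optimizer; a minimal PyTorch sketch with a toy model and random data:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    model = nn.Linear(10, 1)
    # weight_decay applies an L2-style shrinkage of the weights at every update
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01, weight_decay=1e-4)

    x, y = torch.randn(32, 10), torch.randn(32, 1)
    loss = F.mse_loss(model(x), y)
    loss.backward()
    optimizer.step()              # the update includes the decay term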
Q.148 Which of the following is NOT a typical component of a deep learning pipeline for predicting protein‑structure?
Data preprocessing
Model training
Hyperparameter tuning
Feature selection only
Explanation - While feature selection can be used, deep learning typically learns representations directly.
Correct answer is: Feature selection only
Q.149 Which metric provides an overall sense of a binary classifier’s performance across all threshold settings?
Precision
Recall
Accuracy
Area Under the ROC Curve (AUC‑ROC)
Explanation - AUC‑ROC aggregates performance across thresholds, summarizing sensitivity vs. specificity trade‑offs.
Correct answer is: Area Under the ROC Curve (AUC‑ROC)
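A short scikit-learn example with toy labels and scores:

    from sklearn.metrics import roc_auc_score

    y_true = [0, 0, 1, 1, 0, 1]                    # e.g., non-binding vs. binding sites
    y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9]      # predicted probabilities
    print(roc_auc_score(y_true, y_score))          # threshold-free summary of ranking quality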
Q.150 What does 'learning rate' control in neural network training?
The size of the step taken when updating the model's weights
The number of layers in the network
The number of epochs
The size of the training set
Explanation - The learning rate determines how large a step the optimizer takes in weight space.
Correct answer is: The size of the step taken when updating the model's weights
Q.151 Which of the following is a common way to evaluate the performance of an autoencoder?
Cross‑entropy loss
Reconstruction error (e.g., MSE)
Accuracy
Precision
Explanation - Autoencoders are evaluated based on how well they reconstruct input data.
Correct answer is: Reconstruction error (e.g., MSE)
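A minimal sketch of the metric itself, using random toy data in place of a trained autoencoder's output:

    import numpy as np

    def reconstruction_error(X, X_hat):
        # Mean squared error between inputs and their reconstructions
        return np.mean((X - X_hat) ** 2)

    X = np.random.rand(10, 100)                        # toy expression profiles
    X_hat = X + np.random.normal(0, 0.05, X.shape)     # stand-in for decoder output
    print(reconstruction_error(X, X_hat))              # lower = better reconstruction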
Q.152 What is the main advantage of using 'transfer learning' for a bioinformatics problem with limited data?
It reduces the need for large labeled datasets by leveraging pre‑trained models
It eliminates the need for feature scaling
It guarantees perfect accuracy
It increases the number of layers automatically
Explanation - Transfer learning uses knowledge from related tasks to improve learning efficiency.
Correct answer is: It reduces the need for large labeled datasets by leveraging pre‑trained models
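A minimal PyTorch sketch of the freeze-and-fine-tune pattern; the layer sizes and the "pretrained" network are placeholders, not a real published model:

    import torch.nn as nn

    # Stand-in for a network pre-trained on a large, related dataset
    pretrained = nn.Sequential(
        nn.Linear(1000, 256), nn.ReLU(),
        nn.Linear(256, 64), nn.ReLU(),
    )

    for p in pretrained.parameters():
        p.requires_grad = False          # freeze the learned representation

    # Only this small task-specific head is trained on the limited labeled data
    model = nn.Sequential(pretrained, nn.Linear(64, 2))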
Q.153 Which of the following best describes the 'softmax' activation function?
It returns a value between 0 and 1 for each class, summing to 1
It outputs a binary decision
It normalizes features to zero mean
It is used for regression tasks
Explanation - Softmax converts logits to a probability distribution over classes.
Correct answer is: It returns a value between 0 and 1 for each class, summing to 1
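A short NumPy sketch with toy logits:

    import numpy as np

    def softmax(logits):
        z = logits - np.max(logits)        # subtract the max for numerical stability
        exp_z = np.exp(z)
        return exp_z / exp_z.sum()

    print(softmax(np.array([2.0, 1.0, 0.1])))   # values in (0, 1) that sum to 1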
Q.154 Which of the following is a benefit of using an 'ensemble' approach in bioinformatics?
It always improves accuracy
It reduces overfitting by averaging predictions
It eliminates the need for preprocessing
It speeds up training time
Explanation - Ensembles combine multiple models, mitigating individual model variance.
Correct answer is: It reduces overfitting by averaging predictions
Q.155 In a multi‑class classification problem, which metric measures the harmonic mean of precision and recall for each class?
F1‑Score
Accuracy
Recall
Precision
Explanation - The F1‑score aggregates precision and recall into a single metric per class.
Correct answer is: F1‑Score
Q.156 Which of the following is an example of a 'synthetic' dataset in machine learning?
Real‑world clinical data
Data generated by a simulation or generative model
A database of genomic sequences
A set of experimental measurements
Explanation - Synthetic data is artificially created rather than collected from experiments.
Correct answer is: Data generated by a simulation or generative model
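A quick scikit-learn example of generating a synthetic classification dataset (all parameter values are arbitrary):

    from sklearn.datasets import make_classification

    # Simulated dataset: 200 "samples" with 50 "gene" features, 2 classes
    X, y = make_classification(n_samples=200, n_features=50,
                               n_informative=10, random_state=0)
    print(X.shape, y.shape)                # (200, 50) (200,)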
Q.157 What is the primary role of a 'loss function' in training a machine learning model?
To regularize the model
To measure the difference between predicted and true values
To initialize weights
To determine the number of layers
Explanation - The loss guides weight updates by quantifying prediction error.
Correct answer is: To measure the difference between predicted and true values
Q.158 Which of the following is an advantage of using 'cross‑validation' during model development?
It increases training time
It provides a more reliable estimate of model performance
It reduces the number of hyperparameters
It eliminates the need for a test set
Explanation - Cross‑validation uses multiple splits, reducing variance in performance estimates.
Correct answer is: It provides a more reliable estimate of model performance
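A brief scikit-learn sketch with random toy data standing in for real features:

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score

    X = np.random.rand(150, 30)
    y = np.random.randint(0, 2, 150)

    scores = cross_val_score(RandomForestClassifier(random_state=0), X, y, cv=5)
    print(scores, scores.mean())           # one score per fold; the mean is a more stable estimate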
Q.159 Which of the following best describes a 'feature' in the context of machine learning?
A label for training
A single piece of information used to represent an instance
A hyperparameter of the model
The final prediction of the model
Explanation - Features are input attributes that the model uses to make predictions.
Correct answer is: A single piece of information used to represent an instance
Q.160 In a supervised learning scenario for disease classification, which of the following is considered a 'label'?
Gene expression levels
The patient’s age
The diagnosis (e.g., cancer vs. healthy)
The number of sequencing reads
Explanation - The label is the target variable the model learns to predict.
Correct answer is: The diagnosis (e.g., cancer vs. healthy)
Q.161 Which of the following best explains why 't‑SNE' is often used for visualizing high‑dimensional biological data?
It reduces dimensionality while preserving global structure
It preserves local neighborhood relationships in lower dimensions
It is computationally cheap
It always yields linear clusters
Explanation - t‑SNE captures local similarities, making clusters visible in 2‑D/3‑D plots.
Correct answer is: It preserves local neighborhood relationships in lower dimensions
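A minimal scikit-learn sketch with random data standing in for a cell-by-gene matrix; the perplexity value is a typical but arbitrary choice:

    import numpy as np
    from sklearn.manifold import TSNE

    X = np.random.rand(300, 2000)          # e.g., 300 cells x 2000 genes (toy)
    embedding = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)
    print(embedding.shape)                 # (300, 2) coordinates for a 2-D scatter plot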
