Q.1 What is the primary objective of supervised learning in gene expression analysis?
To cluster similar genes together
To predict the expression level of a gene given a set of features
To reduce dimensionality of the data
To identify outliers in the dataset
Explanation - Supervised learning uses labeled data (e.g., expression levels) to train models that can predict outcomes for new, unseen data.
Correct answer is: To predict the expression level of a gene given a set of features
Q.2 Which evaluation metric is most appropriate for assessing a protein‑binding site classifier when the classes are highly imbalanced?
Accuracy
Precision
Recall
Area under the ROC curve (AUC)
Explanation - AUC evaluates the trade‑off between true positive and false positive rates across all thresholds, making it suitable for imbalanced datasets.
Correct answer is: Area under the ROC curve (AUC)
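A minimal sketch of why a threshold-independent metric matters here (assuming scikit-learn; the 95/5 class split and the random scores are invented for illustration):

    import numpy as np
    from sklearn.metrics import accuracy_score, roc_auc_score

    rng = np.random.default_rng(0)
    y_true = np.array([0] * 95 + [1] * 5)          # only 5% positive binding sites
    y_pred_all_neg = np.zeros(100, dtype=int)      # classifier that never predicts a site
    y_score = rng.random(100)                      # hypothetical probability scores

    print(accuracy_score(y_true, y_pred_all_neg))  # 0.95 accuracy, yet the model is useless
    print(roc_auc_score(y_true, y_score))          # ~0.5 for random scores: correctly mediocre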
Q.3 In a convolutional neural network (CNN) applied to DNA sequences, what does a 1‑D convolution kernel learn?
Spatial patterns across 2D images
Temporal patterns in time series data
Motifs or local patterns along the nucleotide sequence
Global features of the whole sequence
Explanation - 1‑D convolutions slide a kernel along a single dimension, capturing local motifs within the linear sequence of nucleotides.
Correct answer is: Motifs or local patterns along the nucleotide sequence
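A minimal sketch of a 1-D convolution over one-hot encoded DNA (assuming PyTorch; the sequence length and kernel width are arbitrary choices):

    import torch
    import torch.nn as nn

    seq = torch.randint(0, 4, (1, 100))                 # toy sequence of 100 nucleotides
    x = nn.functional.one_hot(seq, num_classes=4)       # shape (1, 100, 4)
    x = x.permute(0, 2, 1).float()                      # Conv1d expects (batch, channels, length)

    conv = nn.Conv1d(in_channels=4, out_channels=16, kernel_size=8)
    motif_scores = conv(x)                              # each of the 16 kernels scans for an 8-bp motif
    print(motif_scores.shape)                           # torch.Size([1, 16, 93])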
Q.4 What is the main advantage of using a Random Forest over a single decision tree in predicting cancer subtypes?
It requires less computational power
It can handle missing values automatically
It reduces overfitting by averaging multiple trees
It always achieves 100% accuracy
Explanation - Random Forests aggregate predictions from many trees, smoothing out individual tree variance and improving generalization.
Correct answer is: It reduces overfitting by averaging multiple trees
Q.5 Which feature selection method removes irrelevant genes before feeding data into a machine learning model?
Principal Component Analysis (PCA)
Recursive Feature Elimination (RFE)
K‑Means clustering
t‑SNE
Explanation - RFE ranks features by importance and recursively discards the least important ones, effectively reducing dimensionality.
Correct answer is: Recursive Feature Elimination (RFE)
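A minimal sketch of RFE on a synthetic expression-like matrix (assuming scikit-learn; the data and the choice of 20 retained features are arbitrary):

    from sklearn.datasets import make_classification
    from sklearn.feature_selection import RFE
    from sklearn.linear_model import LogisticRegression

    X, y = make_classification(n_samples=80, n_features=200, n_informative=10, random_state=0)
    selector = RFE(LogisticRegression(max_iter=1000), n_features_to_select=20, step=10)
    selector.fit(X, y)
    print(selector.support_.sum(), "features kept")     # boolean mask over the 200 "genes"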
Q.6 Transfer learning in bioinformatics often starts with a model pre‑trained on which type of data?
Protein‑structure images
Genomic sequences from other species
Clinical lab measurements
Patient questionnaires
Explanation - Pre‑trained models on related species' genomes can transfer learned motifs, improving performance on limited human data.
Correct answer is: Genomic sequences from other species
Q.7 Which unsupervised learning technique is commonly used to identify sub‑populations of cells in single‑cell RNA‑seq data?
Linear Regression
K‑Means clustering
Logistic Regression
Support Vector Machine (SVM)
Explanation - K‑Means partitions cells into clusters based on expression similarity, revealing distinct cell types or states.
Correct answer is: K‑Means clustering
Q.8 In reinforcement learning applied to protein folding, the agent receives a reward when which event occurs?
The protein forms a disulfide bond
The protein achieves a lower free‑energy conformation
The protein folds faster than predicted
The protein contains more alpha‑helices
Explanation - Lower free energy indicates a more stable, biologically relevant fold, serving as a natural reward signal.
Correct answer is: The protein achieves a lower free‑energy conformation
Q.9 What does the term 'cross‑validation' mean in machine learning?
Testing a model on unseen data from a different domain
Training a model with random noise added
Splitting data into training and testing subsets multiple times
Using a single large dataset for training only
Explanation - Cross‑validation evaluates model robustness by repeatedly partitioning the data.
Correct answer is: Splitting data into training and testing subsets multiple times
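A minimal sketch of repeated train/test splitting via 5-fold cross-validation (assuming scikit-learn; the synthetic data is only for illustration):

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score

    X, y = make_classification(n_samples=200, n_features=50, random_state=0)
    scores = cross_val_score(RandomForestClassifier(random_state=0), X, y, cv=5)
    print(scores.mean(), "+/-", scores.std())            # average held-out accuracy across the 5 folds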
Q.10 Why are one‑hot encodings commonly used for DNA sequence data before feeding them into a neural network?
They reduce the dimensionality of the data
They preserve the order of nucleotides
They convert categorical nucleotides into numeric vectors
They make the data sparse and thus faster to compute
Explanation - One‑hot encoding transforms each nucleotide into a binary vector, enabling the network to process categorical information numerically.
Correct answer is: They convert categorical nucleotides into numeric vectors
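A minimal sketch of one-hot encoding a nucleotide string into a numeric matrix (plain NumPy; the A, C, G, T column order is an arbitrary convention):

    import numpy as np

    NUC_INDEX = {"A": 0, "C": 1, "G": 2, "T": 3}

    def one_hot_dna(seq):
        mat = np.zeros((len(seq), 4), dtype=np.float32)   # one row per position, one column per base
        for i, base in enumerate(seq):
            mat[i, NUC_INDEX[base]] = 1.0
        return mat

    print(one_hot_dna("ACGTA"))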
Q.11 Which of the following is NOT a typical step in preprocessing RNA‑seq data for machine learning?
Read alignment
Normalization
Feature extraction
Real‑time visualization
Explanation - While visualization aids interpretation, it is not a preprocessing step for model training.
Correct answer is: Real‑time visualization
Q.12 Dropout is a regularization technique that does what during training?
Adds random noise to the input data
Removes a random subset of neurons from the network each iteration
Normalizes the output of each layer
Permanently deletes underperforming weights
Explanation - Dropout randomly drops neurons to prevent over‑reliance on any particular feature.
Correct answer is: Removes a random subset of neurons from the network each iteration
Q.13 Which type of neural network is most suitable for modeling time‑dependent gene expression data?
Convolutional Neural Network (CNN)
Recurrent Neural Network (RNN)
Feed‑forward Neural Network
Autoencoder
Explanation - RNNs capture sequential dependencies, making them ideal for time‑series gene expression profiles.
Correct answer is: Recurrent Neural Network (RNN)
Q.14 What is the main goal of using an autoencoder on high‑dimensional gene expression data?
To classify cell types
To reduce dimensionality while preserving key patterns
To generate synthetic genes
To identify mutations
Explanation - Autoencoders learn compressed representations that retain important variance in the data.
Correct answer is: To reduce dimensionality while preserving key patterns
Q.15 In a supervised classification task, the F1‑score is a combination of which two metrics?
Accuracy and specificity
Precision and recall
Sensitivity and specificity
Precision and accuracy
Explanation - F1 is the harmonic mean of precision and recall, balancing false positives and false negatives.
Correct answer is: Precision and recall
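A minimal worked example of the harmonic mean behind F1 (the confusion-matrix counts are invented):

    def f1_from_counts(tp, fp, fn):
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        return 2 * precision * recall / (precision + recall)

    # precision = 40/50 = 0.80, recall = 40/60 ~= 0.67  ->  F1 ~= 0.73
    print(f1_from_counts(tp=40, fp=10, fn=20))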
Q.16 Which of these is an example of a structured output prediction problem in bioinformatics?
Predicting the presence of a disease
Predicting the secondary structure of a protein
Clustering patient records
Estimating the age of a patient
Explanation - Secondary structure prediction involves outputting a sequence of labels (e.g., helix, sheet) aligned with the input sequence.
Correct answer is: Predicting the secondary structure of a protein
Q.17 What role does the 'kernel trick' play in Support Vector Machines?
It reduces the number of training samples
It transforms data into a higher‑dimensional space to make it linearly separable
It normalizes the input features
It speeds up the training process
Explanation - The kernel trick allows SVMs to operate in implicit high‑dimensional feature spaces without explicit mapping.
Correct answer is: It transforms data into a higher‑dimensional space to make it linearly separable
Q.18 Which performance metric is most informative when a model’s false positives are more costly than false negatives in a disease‑diagnosis scenario?
Sensitivity
Specificity
Accuracy
ROC AUC
Explanation - Specificity measures the proportion of actual negatives correctly identified; a highly specific model therefore produces few false positives.
Correct answer is: Specificity
Q.19 Why is batch normalization used in deep learning models for bioinformatics?
To normalize gene expression levels across samples
To reduce internal covariate shift and stabilize training
To remove batch effects from the data
To increase the size of the training dataset
Explanation - Batch normalization standardizes activations, speeding up convergence and improving robustness.
Correct answer is: To reduce internal covariate shift and stabilize training
Q.20 Which of the following best describes 'transfer learning' in the context of predicting protein‑protein interactions?
Training a model from scratch using only human data
Using a pre‑trained model on a related task and fine‑tuning it on protein‑protein data
Transferring data from one species to another without any adaptation
Applying genetic algorithms to optimize the model
Explanation - Transfer learning leverages knowledge from a source task to improve performance on a target task.
Correct answer is: Using a pre‑trained model on a related task and fine‑tuning it on protein‑protein data
Q.21 In a k‑NN classifier, how does increasing 'k' affect the model's bias‑variance trade‑off?
Increases bias and decreases variance
Decreases bias and increases variance
Has no effect on bias or variance
Increases both bias and variance
Explanation - A larger k smooths predictions, reducing variance but increasing bias.
Correct answer is: Increases bias and decreases variance
Q.22 What is a key difference between a one‑class SVM and a binary SVM?
One‑class SVM can only handle continuous data
One‑class SVM is used for anomaly detection
Binary SVM requires labeled data while one‑class does not
Both statements are true
Explanation - One‑class SVM learns a boundary around normal data, flagging outliers as anomalies.
Correct answer is: One‑class SVM is used for anomaly detection
Q.23 Which type of deep learning architecture is commonly applied to predict the 3D structure of proteins from their amino‑acid sequence?
Recurrent Neural Network (RNN)
Convolutional Neural Network (CNN)
Graph Neural Network (GNN)
Feed‑forward Neural Network
Explanation - GNNs naturally model residues as nodes and interactions as edges, capturing spatial relationships.
Correct answer is: Graph Neural Network (GNN)
Q.24 Which technique is used to handle class imbalance in training a machine learning model for rare genetic disease prediction?
Undersampling the majority class
Oversampling the minority class using SMOTE
Adding noise to the minority class
All of the above
Explanation - Various resampling strategies, including SMOTE, help balance the dataset for better learning.
Correct answer is: All of the above
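A minimal sketch of oversampling the minority class with SMOTE (assuming the imbalanced-learn package; the 95/5 split is illustrative):

    from collections import Counter
    from sklearn.datasets import make_classification
    from imblearn.over_sampling import SMOTE

    X, y = make_classification(n_samples=500, weights=[0.95, 0.05], random_state=0)
    print("before:", Counter(y))
    X_res, y_res = SMOTE(random_state=0).fit_resample(X, y)
    print("after :", Counter(y_res))                      # minority class synthetically upsampled to parity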
Q.25 In feature extraction for DNA microarrays, what is the purpose of using a sliding window approach?
To generate overlapping sequences for training
To increase the number of samples by duplication
To normalize probe intensities
To filter out low‑quality probes
Explanation - Sliding windows create local subsequences, enabling models to capture regional motifs.
Correct answer is: To generate overlapping sequences for training
Q.26 Which loss function is commonly used for multi‑class classification in deep learning?
Mean Squared Error (MSE)
Cross‑Entropy Loss
Huber Loss
Poisson Loss
Explanation - Cross‑entropy measures the difference between predicted probabilities and true one‑hot labels.
Correct answer is: Cross‑Entropy Loss
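A minimal worked example of multi-class cross-entropy (the predicted probabilities are invented):

    import numpy as np

    probs = np.array([[0.7, 0.2, 0.1],                    # predicted class probabilities per sample
                      [0.1, 0.8, 0.1]])
    labels = np.array([0, 1])                             # true class indices
    loss = -np.log(probs[np.arange(len(labels)), labels]).mean()
    print(loss)                                           # -(log 0.7 + log 0.8) / 2 ~= 0.29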
Q.27 In a genetic algorithm used to optimize a neural network architecture, which component mimics biological evolution?
Selection
Mutation
Crossover
All of the above
Explanation - These operators simulate survival of the fittest, random mutation, and recombination.
Correct answer is: All of the above
Q.28 What is the main reason to use t‑SNE for visualizing high‑dimensional bioinformatics data?
It preserves global distances accurately
It is computationally efficient for very large datasets
It preserves local structure, revealing clusters
It reduces dimensionality to one dimension only
Explanation - t‑SNE focuses on preserving local neighborhoods, making it great for cluster visualization.
Correct answer is: It preserves local structure, revealing clusters
Q.29 Which of these is an example of a supervised learning algorithm?
K‑Means clustering
Principal Component Analysis (PCA)
Support Vector Machine (SVM) for classification
Hierarchical clustering
Explanation - SVM uses labeled data to find a decision boundary, making it supervised.
Correct answer is: Support Vector Machine (SVM) for classification
Q.30 During the training of a recurrent neural network on mRNA secondary structure prediction, why is teacher forcing applied?
To accelerate convergence by using the true previous output during training
To enforce weight sharing across time steps
To prevent overfitting by dropout
To regularize the loss function
Explanation - Teacher forcing feeds the correct output at each step, speeding up training of sequential models.
Correct answer is: To accelerate convergence by using the true previous output during training
Q.31 Which type of data augmentation is particularly useful when training CNNs on protein surface images?
Horizontal flipping
Rotations around multiple axes
Adding Gaussian noise to pixel values
Color jittering
Explanation - Protein surfaces are 3‑D, so rotations preserve structural validity while increasing dataset diversity.
Correct answer is: Rotations around multiple axes
Q.32 Which of the following metrics best captures the trade‑off between sensitivity and specificity in a binary classifier?
F1‑Score
Matthews Correlation Coefficient (MCC)
Precision
Recall
Explanation - MCC considers true/false positives and negatives, providing a balanced measure even with class imbalance.
Correct answer is: Matthews Correlation Coefficient (MCC)
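A minimal sketch of MCC computed directly from the four confusion-matrix counts (the counts are invented to mimic class imbalance):

    import math

    def mcc(tp, tn, fp, fn):
        num = tp * tn - fp * fn
        den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
        return num / den if den else 0.0

    print(mcc(tp=5, tn=90, fp=3, fn=2))                   # ~0.64 despite only 7% positives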
Q.33 In the context of next‑generation sequencing (NGS) data, what does a 'coverage depth' of 30× mean?
Each base is read 30 times on average
30% of the genome is covered by reads
Reads span 30 base‑pairs on average
The data set contains 30 million reads
Explanation - Coverage depth refers to the average number of times each nucleotide position is sequenced.
Correct answer is: Each base is read 30 times on average
Q.34 What is the purpose of using a confusion matrix in evaluating a disease‑prediction model?
To calculate the ROC curve
To summarize prediction outcomes across all classes
To measure the model’s loss
To normalize the feature space
Explanation - A confusion matrix shows true/false positives/negatives, providing insight into specific error types.
Correct answer is: To summarize prediction outcomes across all classes
Q.35 Which regularization technique adds a penalty equal to the absolute value of weights to the loss function?
L1 regularization
L2 regularization
Elastic Net
Dropout
Explanation - L1 regularization encourages sparsity by penalizing the absolute magnitude of weights.
Correct answer is: L1 regularization
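A minimal sketch of an L1-penalized loss (the MSE base loss and the lambda value are illustrative choices):

    import numpy as np

    def l1_regularized_mse(y_true, y_pred, weights, lam=0.01):
        mse = np.mean((y_true - y_pred) ** 2)
        return mse + lam * np.sum(np.abs(weights))        # penalty on absolute weight values

    # larger |weights| -> larger penalty, which pushes uninformative weights toward exactly zero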
Q.36 Which of these is NOT an example of a supervised learning model commonly used in bioinformatics?
Decision Tree
Random Forest
k‑Nearest Neighbors
Gaussian Mixture Model
Explanation - Gaussian Mixture Models are unsupervised, used for clustering rather than classification.
Correct answer is: Gaussian Mixture Model
Q.37 What does the 'softmax' function output in a neural network?
Binary predictions (0 or 1)
A probability distribution over multiple classes
Raw logits for regression
A single continuous value
Explanation - Softmax normalizes logits into probabilities that sum to one.
Correct answer is: A probability distribution over multiple classes
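A minimal sketch of softmax (the logits are invented):

    import numpy as np

    def softmax(logits):
        z = logits - logits.max()                         # shift by the max for numerical stability
        e = np.exp(z)
        return e / e.sum()

    print(softmax(np.array([2.0, 1.0, 0.1])))             # ~[0.66, 0.24, 0.10], sums to 1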
Q.38 When training a convolutional network for histopathology image classification, why might one use a pre‑trained ResNet as a feature extractor?
Because it eliminates the need for labeled data
Because it learns generic visual features that transfer to medical images
Because it guarantees perfect accuracy on medical data
Because it reduces the need for GPU resources
Explanation - ResNet's lower layers capture edges and textures common across images, aiding transfer learning.
Correct answer is: Because it learns generic visual features that transfer to medical images
Q.39 Which algorithm is specifically designed to handle data with high dimensionality and sparse features, such as gene expression datasets?
Naïve Bayes
Support Vector Machine (SVM) with linear kernel
k‑Nearest Neighbors
Decision Tree
Explanation - Linear SVMs efficiently process high‑dimensional sparse data by optimizing a hyperplane.
Correct answer is: Support Vector Machine (SVM) with linear kernel
Q.40 Which of the following best describes the 'bagging' strategy used in Random Forests?
Training a single large model on the entire dataset
Combining multiple weak learners trained on random subsets of data
Using bootstrapped datasets to reduce bias
Pruning trees to avoid overfitting
Explanation - Bagging averages predictions of many trees, each built on a bootstrap sample.
Correct answer is: Combining multiple weak learners trained on random subsets of data
Q.41 In a bioinformatics pipeline, what is the main purpose of 'variant calling' after DNA sequencing?
To identify differences between the sequenced genome and a reference genome
To assemble the genome from short reads
To filter out low‑quality reads
To predict gene function
Explanation - Variant calling detects SNPs, insertions, deletions, and other genetic alterations.
Correct answer is: To identify differences between the sequenced genome and a reference genome
Q.42 Which metric would you use to evaluate a regression model predicting drug concentrations?
Accuracy
Mean Absolute Error (MAE)
Area Under Curve (AUC)
F1‑Score
Explanation - MAE measures average absolute deviation between predicted and actual continuous values.
Correct answer is: Mean Absolute Error (MAE)
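A minimal worked example of MAE for continuous predictions (the values are invented):

    import numpy as np

    y_true = np.array([1.2, 0.8, 2.5])                    # measured drug concentrations
    y_pred = np.array([1.0, 1.0, 2.0])                    # model predictions
    print(np.mean(np.abs(y_true - y_pred)))               # (0.2 + 0.2 + 0.5) / 3 = 0.3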
Q.43 What does the term 'overfitting' refer to in the context of machine learning?
When a model performs poorly on the training data
When a model learns noise and performs well only on training data
When a model generalizes too well
When a model has too few parameters
Explanation - Overfitting results in high training accuracy but low test performance due to memorizing noise.
Correct answer is: When a model learns noise and performs well only on training data
Q.44 Which type of neural network is best suited for processing data represented as graphs, such as protein‑protein interaction networks?
Convolutional Neural Network (CNN)
Graph Neural Network (GNN)
Recurrent Neural Network (RNN)
Fully Connected Network
Explanation - GNNs directly operate on graph structures, aggregating node information from neighbors.
Correct answer is: Graph Neural Network (GNN)
Q.45 Why is it important to split a dataset into training, validation, and test sets?
To ensure the model trains quickly
To evaluate the model’s ability to generalize to unseen data
To avoid using a GPU during training
To increase the number of training samples
Explanation - Separating data prevents over‑optimistic estimates of performance.
Correct answer is: To evaluate the model’s ability to generalize to unseen data
Q.46 Which of the following best describes a 'kernel' in machine learning?
A method of data augmentation
A function that measures similarity between data points
A type of loss function
A regularization technique
Explanation - Kernel functions compute dot products in high‑dimensional feature spaces.
Correct answer is: A function that measures similarity between data points
Q.47 What is the main benefit of using a hierarchical clustering approach for phylogenetic tree construction?
It is the fastest clustering method available
It produces a tree that reflects evolutionary relationships
It requires labeled data
It always yields 100% accurate trees
Explanation - Hierarchical clustering groups sequences based on pairwise distances, forming a tree structure.
Correct answer is: It produces a tree that reflects evolutionary relationships
Q.48 In the context of deep learning, what is the primary role of an activation function?
To reduce the dimensionality of the input
To introduce non‑linearity into the model
To compute the loss value
To perform back‑propagation
Explanation - Activation functions allow networks to learn complex patterns beyond linear combinations.
Correct answer is: To introduce non‑linearity into the model
Q.49 Which of these is a common approach for handling missing values in gene expression data?
Delete all samples with missing values
Impute missing values using the mean of the gene expression
Replace missing values with zero
Ignore the missing values during training
Explanation - Mean imputation preserves overall data structure while handling missing entries.
Correct answer is: Impute missing values using the mean of the gene expression
Q.50 What does 'batch size' refer to in neural network training?
The number of epochs to train
The number of samples processed before updating weights
The total size of the training dataset
The size of the hidden layer
Explanation - Batch size determines how many samples are used per gradient update.
Correct answer is: The number of samples processed before updating weights
Q.51 Which technique is used to prevent a neural network from learning noise in the data during training?
Overfitting
Regularization
Bootstrapping
Cross‑validation
Explanation - Regularization adds constraints to the model to avoid overfitting to noise.
Correct answer is: Regularization
Q.52 In bioinformatics, what is the BLAST algorithm typically used for?
To predict protein folding
To align nucleotide or protein sequences against a database
To design primers for PCR
To simulate cellular pathways
Explanation - BLAST searches a query sequence against a database to find similar sequences.
Correct answer is: To align nucleotide or protein sequences against a database
Q.53 Which of the following is a characteristic of an unsupervised learning algorithm?
It requires labeled target values
It only works on binary classification
It discovers patterns without explicit labels
It always outputs a regression value
Explanation - Unsupervised learning finds structure in unlabeled data.
Correct answer is: It discovers patterns without explicit labels
Q.54 What is the purpose of 'one‑hot encoding' in processing categorical genomic data?
To compress the data
To convert categories into binary vectors
To reduce noise
To increase the dataset size
Explanation - One‑hot encoding represents each category as a separate binary feature, enabling numeric processing.
Correct answer is: To convert categories into binary vectors
Q.55 Which of the following best describes 'feature importance' in tree‑based models?
The weight assigned to each input node
The contribution of each feature to model predictions
The number of times a feature is used in a tree
All of the above
Explanation - Feature importance measures how much a feature influences the model’s decisions.
Correct answer is: All of the above
Q.56 Why is it important to use a 'validation set' during hyperparameter tuning?
To estimate how well the model will perform on unseen data
To reduce the size of the training dataset
To calculate the final test accuracy
To determine the number of epochs
Explanation - Validation data provides a realistic assessment of hyperparameter choices without biasing the test set.
Correct answer is: To estimate how well the model will perform on unseen data
Q.57 What is the role of the 'learning rate' in gradient‑based optimization?
It determines how often the model is evaluated
It controls the step size in updating weights
It sets the maximum number of epochs
It decides how many layers are added
Explanation - A higher learning rate may converge faster but risks overshooting, while a lower rate is more stable.
Correct answer is: It controls the step size in updating weights
Q.58 Which deep learning architecture is designed to preserve spatial hierarchies while reducing parameter count?
DenseNet
ResNet
AlexNet
VGG
Explanation - DenseNet connects each layer to all subsequent layers within a dense block, enabling feature reuse with relatively few parameters.
Correct answer is: DenseNet
Q.59 In a classification model, what does a confusion matrix entry of 'False Positive' represent?
A sample correctly classified as negative
A sample incorrectly classified as positive
A sample correctly classified as positive
A sample incorrectly classified as negative
Explanation - A false positive occurs when the model predicts the positive class for an actual negative sample.
Correct answer is: A sample incorrectly classified as positive
Q.60 Which method is commonly used to address high dimensionality in microarray data before classification?
Normalization
Feature selection
Data augmentation
Hyperparameter optimization
Explanation - Feature selection reduces the number of genes considered, improving model performance.
Correct answer is: Feature selection
Q.61 What does 'dropout' accomplish during the training of a neural network?
It adds noise to the inputs
It randomly drops hidden units to prevent co‑adaptation
It ensures the model uses all data points
It reduces the size of the output layer
Explanation - Dropout forces the network to learn redundant representations, reducing overfitting.
Correct answer is: It randomly drops hidden units to prevent co‑adaptation
Q.62 Which algorithm is most suitable for performing dimensionality reduction while preserving non‑linear relationships?
Principal Component Analysis (PCA)
t‑SNE
Linear Discriminant Analysis (LDA)
K‑Means clustering
Explanation - t‑SNE captures complex, non‑linear structures in high‑dimensional data.
Correct answer is: t‑SNE
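A minimal sketch of embedding a high-dimensional matrix with t-SNE (assuming scikit-learn; the toy matrix stands in for a cells x genes table):

    import numpy as np
    from sklearn.manifold import TSNE

    X = np.random.rand(300, 2000)                         # 300 "cells", 2000 "genes"
    embedding = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)
    print(embedding.shape)                                # (300, 2)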
Q.63 In a deep learning model for protein‑binding site detection, which layer type would be most effective for capturing local interaction patterns?
Fully connected layer
Convolutional layer
Recurrent layer
Pooling layer
Explanation - Convolutional layers slide filters over the input, detecting localized motifs relevant for binding.
Correct answer is: Convolutional layer
Q.64 What is a 'hyperparameter' in the context of training a neural network?
A parameter learned during training
A fixed value set before training
A weight that updates during back‑propagation
A regularization coefficient
Explanation - Hyperparameters (e.g., learning rate, batch size) are set externally and not updated during training.
Correct answer is: A fixed value set before training
Q.65 Which evaluation metric is most appropriate when the number of negative instances greatly exceeds positives in a disease detection scenario?
Accuracy
Precision
Recall
AUC‑PR (Precision‑Recall curve)
Explanation - AUC‑PR focuses on precision–recall trade‑off, especially useful for imbalanced data.
Correct answer is: AUC‑PR (Precision‑Recall curve)
Q.66 What is the purpose of a 'learning schedule' in neural network training?
To adjust the learning rate over epochs
To determine the number of layers
To fix the model architecture
To split data into training and test sets
Explanation - Learning schedules reduce the learning rate during training to improve convergence.
Correct answer is: To adjust the learning rate over epochs
Q.67 Which type of neural network is best suited for modeling sequences such as DNA or RNA?
Convolutional Neural Network (CNN)
Recurrent Neural Network (RNN)
Autoencoder
Decision Tree
Explanation - RNNs capture dependencies across sequence positions, essential for biological sequences.
Correct answer is: Recurrent Neural Network (RNN)
Q.68 Which of the following is a key advantage of using a Support Vector Machine (SVM) for classifying genomic data?
It does not require any feature scaling
It can handle high‑dimensional data efficiently
It always achieves 100% accuracy
It does not need labeled data
Explanation - SVMs find a separating hyperplane in high‑dimensional spaces, suitable for genomic features.
Correct answer is: It can handle high‑dimensional data efficiently
Q.69 What does the 'softmax' function do in a classification neural network?
It normalizes logits into a probability distribution
It applies a non‑linear activation to hidden layers
It computes the gradient for back‑propagation
It reduces dimensionality of the input
Explanation - Softmax transforms raw scores into probabilities that sum to one.
Correct answer is: It normalizes logits into a probability distribution
Q.70 Which of the following is an example of an unsupervised learning technique used in bioinformatics?
Support Vector Machine (SVM)
K‑Means clustering
Logistic Regression
Gradient Boosting
Explanation - K‑Means groups data points into clusters without using labels, a common unsupervised approach.
Correct answer is: K‑Means clustering
Q.71 In a genetic algorithm, what is the role of 'crossover'?
To create new individuals by combining parts of two parents
To mutate random bits in an individual
To evaluate fitness of an individual
To select individuals for reproduction
Explanation - Crossover exchanges genetic material between parents, producing new offspring.
Correct answer is: To create new individuals by combining parts of two parents
Q.72 Which of the following best describes the 'curse of dimensionality'?
High dimensional data can be processed faster
Distance metrics become less informative as dimensions increase
It is a problem only in low‑dimensional data
It only affects neural networks
Explanation - In high dimensions, points tend to be equidistant, making nearest‑neighbor methods ineffective.
Correct answer is: Distance metrics become less informative as dimensions increase
Q.73 Which machine learning algorithm is specifically designed to model time‑series data with memory of previous states?
Random Forest
Support Vector Machine
Recurrent Neural Network (RNN)
k‑Nearest Neighbors
Explanation - RNNs maintain hidden states that capture temporal dependencies.
Correct answer is: Recurrent Neural Network (RNN)
Q.74 Which evaluation metric is commonly used to measure how well a clustering algorithm has grouped similar items together?
Silhouette Score
Accuracy
Precision
Recall
Explanation - The silhouette score evaluates cohesion and separation of clusters.
Correct answer is: Silhouette Score
Q.75 What is the main purpose of 'cross‑entropy loss' in a classification problem?
To compute the difference between predicted and true values in regression
To penalize incorrect probabilistic predictions
To enforce sparsity in weights
To measure similarity between sequences
Explanation - Cross‑entropy loss quantifies how far predicted probabilities diverge from true labels.
Correct answer is: To penalize incorrect probabilistic predictions
Q.76 Which of the following is a key advantage of using an autoencoder for gene expression data?
It always achieves perfect reconstruction
It reduces dimensionality while preserving key patterns
It eliminates the need for labeled data
It requires no computational resources
Explanation - Autoencoders learn a compressed representation that captures essential structure.
Correct answer is: It reduces dimensionality while preserving key patterns
Q.77 In machine learning pipelines, what is 'feature scaling' and why is it important?
It normalizes input values to a similar range, improving optimization convergence
It selects the most important features automatically
It increases the dimensionality of the data
It converts categorical data into numeric data
Explanation - Feature scaling prevents attributes with large ranges from dominating the learning process.
Correct answer is: It normalizes input values to a similar range, improving optimization convergence
Q.78 Which of these neural network architectures is best suited for capturing long‑range dependencies in genomic sequences?
Feed‑forward network
Convolutional neural network (CNN)
Long Short‑Term Memory (LSTM) network
Autoencoder
Explanation - LSTMs maintain memory over long sequences, ideal for genomic data with distant motifs.
Correct answer is: Long Short‑Term Memory (LSTM) network
Q.79 What does 'epoch' refer to in the context of training a neural network?
The number of layers in the network
The number of times the entire training dataset is passed through the network
The learning rate schedule
The number of neurons in the output layer
Explanation - An epoch is one full pass of all training samples during learning.
Correct answer is: The number of times the entire training dataset is passed through the network
Q.80 Which technique is used to evaluate the performance of a regression model predicting drug concentrations?
Mean Absolute Error (MAE)
Accuracy
Precision
Recall
Explanation - MAE measures the average absolute difference between predicted and actual values.
Correct answer is: Mean Absolute Error (MAE)
Q.81 In the context of next‑generation sequencing, what does a 'read alignment' step accomplish?
It assembles the reads into a complete genome
It matches sequencing reads to a reference genome
It removes duplicate reads
It predicts gene expression levels
Explanation - Read alignment aligns short sequence reads to a reference for variant calling.
Correct answer is: It matches sequencing reads to a reference genome
Q.82 Which of the following is a common challenge when working with RNA‑seq data for machine learning?
The data is too small for training
The data has high dimensionality and sparse counts
The data is always balanced
The data requires no preprocessing
Explanation - RNA‑seq generates thousands of genes with many zeros, complicating learning.
Correct answer is: The data has high dimensionality and sparse counts
Q.83 What does 'k‑fold cross‑validation' involve?
Training the model k times on the entire dataset
Dividing the data into k subsets and rotating the test set
Using k as the number of hidden layers
Scaling features by a factor of k
Explanation - k‑fold CV partitions data into k folds, each used once for testing.
Correct answer is: Dividing the data into k subsets and rotating the test set
Q.84 Which metric is most appropriate when evaluating a model that predicts a continuous value in a bioinformatics context?
Accuracy
Precision
Mean Squared Error (MSE)
Recall
Explanation - MSE is a standard loss for regression tasks measuring squared difference.
Correct answer is: Mean Squared Error (MSE)
Q.85 Which of the following best describes a 'hyperplane' in the context of Support Vector Machines?
A two‑dimensional plane separating classes
A line connecting two support vectors
A decision boundary that maximizes margin between classes
A set of weights used in training
Explanation - An SVM hyperplane separates classes with the largest possible margin.
Correct answer is: A decision boundary that maximizes margin between classes
Q.86 In a convolutional neural network, what is the purpose of the 'pooling' layer?
To increase the number of features
To reduce spatial dimensions and computation
To initialize weights
To compute the loss function
Explanation - Pooling down‑samples feature maps, reducing parameters and overfitting risk.
Correct answer is: To reduce spatial dimensions and computation
Q.87 Which of the following is a key advantage of using a ReLU activation function in neural networks?
It is non‑linear and helps with vanishing gradients
It always outputs values between 0 and 1
It reduces the number of parameters
It performs well with small datasets
Explanation - ReLU introduces non‑linearity and mitigates vanishing gradients for deep nets.
Correct answer is: It is non‑linear and helps with vanishing gradients
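A minimal sketch of ReLU (the inputs are invented):

    import numpy as np

    def relu(x):
        return np.maximum(0.0, x)                         # gradient is 1 for x > 0 and 0 otherwise

    print(relu(np.array([-2.0, -0.5, 0.0, 1.5])))         # [0.  0.  0.  1.5]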
Q.88 What is the primary purpose of a 'loss function' in machine learning?
To compute the gradient for back‑propagation
To measure the difference between predicted and true values
To select the best hyperparameters
To reduce overfitting
Explanation - The loss function quantifies prediction error, guiding weight updates.
Correct answer is: To measure the difference between predicted and true values
Q.89 Which of the following best describes a 'kernel trick' in SVM?
Computing distances in the input space
Transforming data into a higher‑dimensional space via a kernel function
Using a different loss function
Reducing dimensionality with PCA
Explanation - The kernel trick allows SVMs to operate implicitly in high‑dimensional feature spaces.
Correct answer is: Transforming data into a higher‑dimensional space via a kernel function
Q.90 Which type of neural network is specifically designed to capture spatial hierarchies in data such as images or 1‑D sequences?
Recurrent Neural Network (RNN)
Convolutional Neural Network (CNN)
Autoencoder
Feed‑forward Neural Network
Explanation - CNNs use convolutional layers to learn hierarchical spatial features.
Correct answer is: Convolutional Neural Network (CNN)
Q.91 Which of the following metrics is most appropriate for evaluating the performance of a model that detects rare genetic variants?
Accuracy
Recall
Precision
Matthews Correlation Coefficient (MCC)
Explanation - MCC provides a balanced measure even with extreme class imbalance.
Correct answer is: Matthews Correlation Coefficient (MCC)
Q.92 What does the 'bias term' do in a neural network layer?
It shifts a neuron's output by a learnable offset, independent of its inputs
It regularizes the model to prevent overfitting
It increases the number of neurons
It is added to the loss function
Explanation - The bias term adjusts the output of a neuron independently of its inputs.
Correct answer is: It shifts a neuron's output by a learnable offset, independent of its inputs
Q.93 Which technique is commonly used to prevent overfitting in deep learning models?
Batch normalization
Data augmentation
Dropout
All of the above
Explanation - All listed methods help reduce overfitting by regularizing or augmenting data.
Correct answer is: All of the above
Q.94 In a classification problem with many classes, which metric would you use to compare the performance of different models?
Overall accuracy
Macro‑averaged F1‑score
Micro‑averaged F1‑score
All of the above
Explanation - Macro‑averaging gives equal weight to all classes, useful when classes are imbalanced.
Correct answer is: Macro‑averaged F1‑score
Q.95 What does a 'one‑class SVM' primarily aim to do in bioinformatics applications?
Model normal samples and flag anomalies
The most frequent class in the data
All classes equally
The class with the largest margin
Explanation - One‑class SVM learns a boundary around normal data, labeling outliers as anomalies.
Correct answer is: Model normal samples and flag anomalies
Q.96 Which of the following is NOT a common preprocessing step for RNA‑seq data before machine learning?
Normalization
Batch effect correction
Feature scaling
Data duplication
Explanation - Data duplication is generally avoided as it can introduce bias.
Correct answer is: Data duplication
Q.97 What is the purpose of a 'feature importance score' in a Random Forest model?
To quantify each feature's contribution to the model's predictions
To evaluate the model’s accuracy
To determine the optimal learning rate
To calculate the loss function
Explanation - Feature importance indicates the influence of each feature on predictions.
Correct answer is: To quantify each feature's contribution to the model's predictions
Q.98 In the context of bioinformatics, what does the term 'motif' refer to?
A short, recurring pattern in DNA or protein sequences
A computational algorithm for alignment
A type of neural network
A statistical test for significance
Explanation - Motifs are conserved sequence patterns associated with functional or structural roles.
Correct answer is: A short, recurring pattern in DNA or protein sequences
Q.99 Which of the following best describes the 'dropout rate' in a neural network?
The learning rate used during training
The proportion of neurons randomly dropped during each training step
The fraction of data used for validation
The number of epochs to train
Explanation - Dropout rate specifies how many neurons are temporarily disabled to prevent overfitting.
Correct answer is: The proportion of neurons randomly dropped during each training step
Q.100 What is the primary difference between a 'CNN' and a 'RNN' for sequence data?
CNNs capture local patterns, RNNs capture sequential dependencies
CNNs require more parameters than RNNs
RNNs can only handle images, CNNs only handle text
CNNs are unsupervised, RNNs are supervised
Explanation - CNNs slide filters over the sequence, whereas RNNs maintain state across positions.
Correct answer is: CNNs capture local patterns, RNNs capture sequential dependencies
Q.101 Which evaluation metric is best suited to assess a model that predicts whether a variant is pathogenic or benign?
Accuracy
Recall
Precision
Area under the Precision‑Recall curve (AUC‑PR)
Explanation - AUC‑PR focuses on the trade‑off between precision and recall, important for imbalanced variant data.
Correct answer is: Area under the Precision‑Recall curve (AUC‑PR)
Q.102 In a neural network, what is the role of the 'softmax' layer?
To compute probabilities for multi‑class classification
To perform dimensionality reduction
To normalize inputs
To calculate gradient updates
Explanation - Softmax converts raw logits into a probability distribution over classes.
Correct answer is: To compute probabilities for multi‑class classification
Q.103 Which of the following is a common method for reducing the dimensionality of gene expression data before classification?
Feature selection
PCA
Both A and B
None of the above
Explanation - Both feature selection and PCA are widely used to reduce dimensionality.
Correct answer is: Both A and B
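A minimal sketch of the PCA half of that answer (assuming scikit-learn; the matrix dimensions are illustrative):

    import numpy as np
    from sklearn.decomposition import PCA

    X = np.random.rand(100, 5000)                         # 100 samples x 5000 genes
    X_reduced = PCA(n_components=10).fit_transform(X)
    print(X_reduced.shape)                                # (100, 10)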
Q.104 What is the purpose of the 'learning rate decay' schedule in deep learning?
To increase the learning rate over time
To gradually reduce the learning rate during training
To adjust the batch size dynamically
To change the number of epochs
Explanation - Learning rate decay helps refine convergence as training progresses.
Correct answer is: To gradually reduce the learning rate during training
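A minimal sketch of an exponential learning-rate decay schedule (the initial rate and decay factor are arbitrary):

    def decayed_lr(initial_lr, decay_rate, epoch):
        return initial_lr * decay_rate ** epoch           # smaller steps as training progresses

    for epoch in (0, 10, 50):
        print(epoch, decayed_lr(0.01, 0.95, epoch))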
Q.105 Which of the following describes a 'Gaussian Mixture Model (GMM)'?
A supervised classification model
An unsupervised clustering algorithm based on Gaussian distributions
A linear regression technique
A type of neural network
Explanation - GMM assumes data points are generated from a mixture of Gaussians.
Correct answer is: An unsupervised clustering algorithm based on Gaussian distributions
Q.106 Which of the following best describes the 'Adam' optimizer in training deep neural networks?
It uses only first‑order gradients
It adapts learning rates for each parameter based on first and second moments of gradients
It is equivalent to Stochastic Gradient Descent (SGD)
It requires no hyperparameters
Explanation - Adam computes adaptive learning rates using moving averages of gradients and squared gradients.
Correct answer is: It adapts learning rates for each parameter based on first and second moments of gradients
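A minimal sketch of a single Adam update, showing the first- and second-moment estimates (hyperparameter defaults follow the common convention; this is illustrative, not a full optimizer):

    import numpy as np

    def adam_step(w, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
        m = b1 * m + (1 - b1) * grad                      # running mean of gradients (1st moment)
        v = b2 * v + (1 - b2) * grad ** 2                 # running mean of squared gradients (2nd moment)
        m_hat = m / (1 - b1 ** t)                         # bias correction for early steps
        v_hat = v / (1 - b2 ** t)
        w = w - lr * m_hat / (np.sqrt(v_hat) + eps)       # per-parameter adaptive step
        return w, m, v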
Q.107 What is the main purpose of using a 'validation set' during model development?
To evaluate the model’s performance on unseen data and tune hyperparameters
To compute the final test accuracy
To increase the size of the training dataset
To replace the training set
Explanation - The validation set guides hyperparameter selection without leaking test information.
Correct answer is: To evaluate the model’s performance on unseen data and tune hyperparameters
Q.108 Which of the following best describes a 'deep learning' model?
A model with more than two hidden layers
A model that uses shallow, single‑layer networks
A model that does not require labeled data
A model that only works with images
Explanation - Deep learning refers to neural networks with many hierarchical layers.
Correct answer is: A model with more than two hidden layers
Q.109 In a classification problem, which metric gives equal importance to both classes in an imbalanced dataset?
Accuracy
Precision
Recall
Balanced Accuracy
Explanation - Balanced accuracy averages recall across classes, mitigating class imbalance bias.
Correct answer is: Balanced Accuracy
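A minimal worked example of balanced accuracy (the counts are invented to show the effect of imbalance):

    def balanced_accuracy(tp, tn, fp, fn):
        sensitivity = tp / (tp + fn)                      # recall on the positive class
        specificity = tn / (tn + fp)                      # recall on the negative class
        return (sensitivity + specificity) / 2

    # sensitivity = 8/10 = 0.80, specificity = 50/90 ~= 0.56  ->  ~0.68
    print(balanced_accuracy(tp=8, fn=2, tn=50, fp=40))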
Q.110 Which of the following is an example of a 'reinforcement learning' application in bioinformatics?
Predicting protein‑protein interactions
Training an agent to design stable protein folds
Classifying gene expression profiles
Identifying motifs in DNA sequences
Explanation - Reinforcement learning can optimize protein design by rewarding stable conformations.
Correct answer is: Training an agent to design stable protein folds
Q.111 What is the main challenge of applying deep learning to small bioinformatics datasets?
Model overfitting due to limited data
Inability to capture non‑linear relationships
Difficulty in visualizing results
Large memory requirements
Explanation - Small datasets increase overfitting risk, requiring regularization or data augmentation.
Correct answer is: Model overfitting due to limited data
Q.112 Which of the following best describes a 'one‑hot encoding' scheme for amino acid sequences?
Each amino acid is represented by a real number between 0 and 1
Each amino acid is encoded as a binary vector of length 20
All amino acids are assigned the same vector
It converts sequences into images
Explanation - One‑hot vectors represent each of the 20 standard amino acids uniquely.
Correct answer is: Each amino acid is encoded as a binary vector of length 20
Q.113 In a machine learning pipeline, what is 'normalization' most commonly applied to?
Label data
Input features
Output predictions
Training epochs
Explanation - Normalizing features ensures they are on comparable scales for efficient learning.
Correct answer is: Input features
Q.114 What does the 'batch size' determine in a training loop?
The number of epochs to train
The number of samples processed before the model updates weights
The learning rate schedule
The size of the training dataset
Explanation - Batch size controls how many data points are used to compute a single gradient update.
Correct answer is: The number of samples processed before the model updates weights
Q.115 Which metric is especially informative when dealing with highly imbalanced datasets in classification?
Accuracy
Precision
Recall
Matthews Correlation Coefficient (MCC)
Explanation - MCC takes all four confusion matrix categories into account, providing a balanced measure.
Correct answer is: Matthews Correlation Coefficient (MCC)
Q.116 Which of the following best describes an 'autoencoder' in machine learning?
A supervised classifier
An unsupervised model that learns a compressed representation of input data
A clustering algorithm
A reinforcement learning agent
Explanation - Autoencoders encode input data into a lower‑dimensional latent space and reconstruct it.
Correct answer is: An unsupervised model that learns a compressed representation of input data
Q.117 In the context of genomics, what does 'variant calling' involve?
Aligning sequencing reads to a reference genome
Identifying differences between the sample genome and the reference
Predicting gene expression levels
Normalizing read counts
Explanation - Variant calling detects SNPs, indels, and other genomic alterations.
Correct answer is: Identifying differences between the sample genome and the reference
Q.118 Which of the following is a key benefit of using a 'pre‑trained language model' for predicting RNA‑secondary structure?
It requires no training data
It captures sequence context learned from vast RNA corpora
It always achieves perfect accuracy
It eliminates the need for feature engineering
Explanation - Pre‑trained language models encode rich contextual information that can be fine‑tuned for specific tasks.
Correct answer is: It captures sequence context learned from vast RNA corpora
Q.119 Which of the following is a typical preprocessing step before feeding RNA‑seq counts into a machine learning model?
Log transformation
One‑hot encoding
Feature selection
All of the above
Explanation - Log transform stabilizes variance, one‑hot encoding handles categorical data, and feature selection reduces dimensionality.
Correct answer is: All of the above
Q.120 What does 'cross‑entropy' measure in a classification context?
The distance between two probability distributions
The mean squared error between predictions and labels
The variance of predictions
The correlation coefficient
Explanation - Cross‑entropy quantifies the dissimilarity between predicted probabilities and true labels.
Correct answer is: The distance between two probability distributions
Q.121 Which of the following best explains why 'dropout' can improve model generalization?
It reduces the number of parameters in the model
It encourages the network to learn redundant representations
It speeds up training by skipping computations
It prevents the model from learning any patterns
Explanation - Dropout forces neurons to be robust by not depending on any single feature.
Correct answer is: It encourages the network to learn redundant representations
Q.122 Which of the following is an advantage of using a 'graph neural network' in bioinformatics?
It can directly process relational data such as protein interaction networks
It always requires fewer training samples
It is only applicable to image data
It cannot handle large graphs
Explanation - GNNs operate on graph structures, making them ideal for relational biological data.
Correct answer is: It can directly process relational data such as protein interaction networks
Q.123 In a supervised learning task for predicting protein‑protein binding affinity, which of the following would be considered a 'label'?
The amino‑acid sequence of the protein
The 3D structure of the protein complex
A continuous value representing binding free energy
The number of genes expressed
Explanation - The label is the target variable the model is trained to predict.
Correct answer is: A continuous value representing binding free energy
Q.124 What is the primary function of an 'activation function' in a neural network layer?
To add noise to the input
To introduce non‑linearity into the network
To compute the loss value
To reduce dimensionality
Explanation - Activation functions enable neural networks to model complex patterns.
Correct answer is: To introduce non‑linearity into the network
Q.125 Which of the following metrics is most appropriate for a binary classification problem with a highly skewed class distribution?
Accuracy
Precision
Recall
Matthews Correlation Coefficient (MCC)
Explanation - MCC provides a balanced metric that accounts for all confusion matrix entries.
Correct answer is: Matthews Correlation Coefficient (MCC)
Q.126 What does 't‑SNE' primarily aim to preserve when reducing dimensionality?
Global pairwise distances
Local neighborhood relationships
Variance of the data
Data density in high dimensional space
Explanation - t‑SNE focuses on preserving local structure, producing meaningful clusters.
Correct answer is: Local neighborhood relationships
Q.127 Which of the following best describes a 'feature vector' in machine learning?
A single numeric value representing a sample
A list of numeric attributes describing a sample
A set of labels for training
A model architecture
Explanation - A feature vector encodes all relevant information about an instance for learning.
Correct answer is: A list of numeric attributes describing a sample
Q.128 In a supervised learning setting for disease risk prediction, why is it important to separate the data into training and testing sets?
To avoid overfitting and obtain a realistic estimate of model performance
To increase the size of the training data
To reduce the computational load
To remove irrelevant features
Explanation - Separating data ensures that performance metrics reflect generalization to new samples.
Correct answer is: To avoid overfitting and obtain a realistic estimate of model performance
Q.129 Which type of neural network is best suited for learning patterns from 3‑D volumetric data, such as cryo‑EM images?
1‑D CNN
2‑D CNN
3‑D CNN
Recurrent Neural Network (RNN)
Explanation - 3‑D CNNs can capture spatial relationships across three dimensions.
Correct answer is: 3‑D CNN
Q.130 What is a typical use of a 'latent space' in a generative model such as a VAE?
To store training labels
To compress input data into a lower‑dimensional representation for generation
To accelerate training speed
To evaluate model accuracy
Explanation - The latent space captures essential features that can be decoded into new samples.
Correct answer is: To compress input data into a lower‑dimensional representation for generation
Q.131 Which of the following best describes the 'Adam' optimizer's key feature?
It uses a fixed learning rate
It adapts the learning rate for each parameter based on gradient moments
It is equivalent to standard gradient descent
It requires no hyperparameters
Explanation - Adam computes adaptive learning rates using moving averages of gradients and their squares.
Correct answer is: It adapts the learning rate for each parameter based on gradient moments
Q.132 Which of the following is NOT typically used for dimensionality reduction in gene expression data?
PCA
t‑SNE
SMOTE
Autoencoder
Explanation - SMOTE is a resampling technique for handling class imbalance, not dimensionality reduction.
Correct answer is: SMOTE
Q.133 In a binary classification task, what does a high false‑negative rate indicate?
Many positive cases are incorrectly labeled as negative
Many negative cases are incorrectly labeled as positive
The model has high precision
The model has high accuracy
Explanation - False negatives occur when positives are missed, reducing recall.
Correct answer is: Many positive cases are incorrectly labeled as negative
Q.134 Which of the following best describes the concept of 'regularization' in machine learning?
Adding noise to the data
Adding a penalty term to the loss function to prevent overfitting
Increasing the number of epochs
Normalizing input features
Explanation - Regularization discourages overly complex models by penalizing large weights.
Correct answer is: Adding a penalty term to the loss function to prevent overfitting
Q.135 What is a 'latent variable' in a generative model?
An observable variable in the dataset
A hidden variable that explains observed data patterns
A feature that is always zero
A hyperparameter of the model
Explanation - Latent variables capture underlying structure that generates the observed data.
Correct answer is: A hidden variable that explains observed data patterns
Q.136 Which evaluation metric is specifically designed for regression tasks in bioinformatics?
Accuracy
F1‑Score
Root Mean Square Error (RMSE)
Recall
Explanation - RMSE measures the average magnitude of prediction errors in regression.
Correct answer is: Root Mean Square Error (RMSE)
Q.137 In a CNN used for predicting DNA binding sites, which layer is responsible for detecting motifs?
Input layer
Convolutional layer
Pooling layer
Output layer
Explanation - Convolutional filters learn local patterns (motifs) across the sequence.
Correct answer is: Convolutional layer
Q.138 Which of the following best explains why 'log‑transformation' is used on count data before machine learning?
It linearizes relationships between variables
It reduces the effect of outliers and stabilizes variance
It makes data categorical
It speeds up training
Explanation - Log‑transform mitigates skewness in count data, improving model performance.
Correct answer is: It reduces the effect of outliers and stabilizes variance
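A minimal sketch of a log(1 + x) transform on skewed count data (the counts are invented):

    import numpy as np

    counts = np.array([0, 3, 10, 1500, 90000])            # raw RNA-seq-style counts
    print(np.log1p(counts))                               # [0.   1.39  2.40  7.31  11.41]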
Q.139 What is the main purpose of using a 'validation set' during hyperparameter tuning?
To evaluate model performance on unseen data
To replace the training set
To increase the dataset size
To compute the final test accuracy
Explanation - The validation set provides an unbiased estimate of how hyperparameters affect performance.
Correct answer is: To evaluate model performance on unseen data
Q.140 Which of the following is a common technique to handle high dimensional gene expression data?
Principal Component Analysis (PCA)
Feature selection
Both A and B
None of the above
Explanation - Both PCA and feature selection reduce dimensionality to mitigate the curse of dimensionality.
Correct answer is: Both A and B
Q.141 In a supervised learning model, what does 'overfitting' mean?
The model performs well on test data
The model performs poorly on training data
The model captures noise from training data and fails to generalize
The model has too few parameters
Explanation - Overfitting occurs when a model learns training data intricacies that do not generalize.
Correct answer is: The model captures noise from training data and fails to generalize
Q.142 Which of the following best describes a 'graph convolutional network' (GCN)?
A neural network that processes sequences via convolution
A neural network that processes graph‑structured data via convolution
A clustering algorithm
A reinforcement learning agent
Explanation - GCNs generalize convolution operations to graph data, making them suitable for network biology.
Correct answer is: A neural network that processes graph‑structured data via convolution
Q.143 Which metric would you use to evaluate a multi‑class classification model if all classes are equally important?
Accuracy
Macro‑averaged F1‑Score
Micro‑averaged Precision
All of the above
Explanation - Macro‑averaging treats all classes equally, providing a balanced metric.
Correct answer is: Macro‑averaged F1‑Score
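A quick scikit-learn illustration with made-up labels:

    from sklearn.metrics import f1_score

    y_true = [0, 0, 1, 1, 2, 2, 2]
    y_pred = [0, 1, 1, 1, 2, 0, 2]
    print(f1_score(y_true, y_pred, average="macro"))   # unweighted mean of per-class F1
    print(f1_score(y_true, y_pred, average="micro"))   # pools all decisions together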
Q.144 What is the main advantage of using an 'ensemble' of models in bioinformatics?
It always improves accuracy regardless of data
It reduces the variance and improves robustness compared to a single model
It eliminates the need for feature selection
It speeds up training time
Explanation - Ensembling combines multiple models, averaging out individual errors.
Correct answer is: It reduces the variance and improves robustness compared to a single model
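A small sketch of the averaging idea, with made-up probabilities from three hypothetical models:

    import numpy as np

    p_model1 = np.array([0.9, 0.2, 0.6, 0.4])      # predicted class-1 probabilities
    p_model2 = np.array([0.8, 0.3, 0.7, 0.5])
    p_model3 = np.array([0.7, 0.1, 0.8, 0.3])

    ensemble_prob = np.mean([p_model1, p_model2, p_model3], axis=0)  # average the votes
    ensemble_label = (ensemble_prob >= 0.5).astype(int)
    print(ensemble_prob, ensemble_label)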
Q.145 Which of the following is an example of a supervised learning task in bioinformatics?
Clustering gene expression profiles
Predicting drug response based on genotype
Dimensionality reduction of genomic data
Identifying motifs in DNA sequences
Explanation - Drug response prediction uses labeled outcomes (response vs. non‑response).
Correct answer is: Predicting drug response based on genotype
Q.146 Which of the following best explains the 'gradient descent' algorithm?
A method for computing the optimal hyperparameters
An iterative optimization algorithm that updates parameters in the direction of the steepest descent of the loss function
A data preprocessing technique
A type of loss function
Explanation - Gradient descent finds minima by following the negative gradient of the loss.
Correct answer is: An iterative optimization algorithm that updates parameters in the direction of the steepest descent of the loss function
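A minimal sketch minimizing a one-dimensional toy loss f(w) = (w - 3)^2:

    def grad(w):
        return 2 * (w - 3)        # derivative of the loss

    w, lr = 0.0, 0.1              # initial weight and learning rate
    for _ in range(100):
        w -= lr * grad(w)         # step along the negative gradient
    print(w)                      # converges toward the minimum at w = 3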
Q.147 In the context of neural networks, what does 'weight decay' refer to?
Increasing the weight values during training
Adding a penalty term to the loss function to reduce large weights
Removing weights that are zero
Normalizing input features
Explanation - Weight decay acts like L2 regularization, discouraging large weight values.
Correct answer is: Adding a penalty term to the loss function to reduce large weights
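In practice, weight decay is often passed straight to the optimizer; a minimal PyTorch sketch with a toy model and random data:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    model = nn.Linear(10, 1)
    # weight_decay applies an L2-style shrinkage of the weights at every update
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01, weight_decay=1e-4)

    x, y = torch.randn(32, 10), torch.randn(32, 1)
    loss = F.mse_loss(model(x), y)
    loss.backward()
    optimizer.step()              # the update includes the decay term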
Q.148 Which of the following is NOT a typical component of a deep learning pipeline for predicting protein‑structure?
Data preprocessing
Model training
Hyperparameter tuning
Feature selection only
Explanation - While feature selection can be used, deep learning typically learns representations directly.
Correct answer is: Feature selection only
Q.149 Which metric provides an overall sense of a binary classifier’s performance across all threshold settings?
Precision
Recall
Accuracy
Area Under the ROC Curve (AUC‑ROC)
Explanation - AUC‑ROC aggregates performance across thresholds, summarizing sensitivity vs. specificity trade‑offs.
Correct answer is: Area Under the ROC Curve (AUC‑ROC)
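A short scikit-learn example with toy labels and scores:

    from sklearn.metrics import roc_auc_score

    y_true = [0, 0, 1, 1, 0, 1]                    # e.g., non-binding vs. binding sites
    y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9]      # predicted probabilities
    print(roc_auc_score(y_true, y_score))          # threshold-free summary of ranking quality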
Q.150 What does 'learning rate' control in neural network training?
The size of the step taken when updating the model's weights
The number of layers in the network
The number of epochs
The size of the training set
Explanation - The learning rate determines how large a step the optimizer takes in weight space.
Correct answer is: The size of the step taken when updating the model's weights
Q.151 Which of the following is a common way to evaluate the performance of an autoencoder?
Cross‑entropy loss
Reconstruction error (e.g., MSE)
Accuracy
Precision
Explanation - Autoencoders are evaluated based on how well they reconstruct input data.
Correct answer is: Reconstruction error (e.g., MSE)
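A minimal sketch of the metric itself, using random toy data in place of a trained autoencoder's output:

    import numpy as np

    def reconstruction_error(X, X_hat):
        # Mean squared error between inputs and their reconstructions
        return np.mean((X - X_hat) ** 2)

    X = np.random.rand(10, 100)                        # toy expression profiles
    X_hat = X + np.random.normal(0, 0.05, X.shape)     # stand-in for decoder output
    print(reconstruction_error(X, X_hat))              # lower = better reconstruction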
Q.152 What is the main advantage of using 'transfer learning' for a bioinformatics problem with limited data?
It reduces the need for large labeled datasets by leveraging pre‑trained models
It eliminates the need for feature scaling
It guarantees perfect accuracy
It increases the number of layers automatically
Explanation - Transfer learning uses knowledge from related tasks to improve learning efficiency.
Correct answer is: It reduces the need for large labeled datasets by leveraging pre‑trained models
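A minimal PyTorch sketch of the freeze-and-fine-tune pattern; the layer sizes and the "pretrained" network are placeholders, not a real published model:

    import torch.nn as nn

    # Stand-in for a network pre-trained on a large, related dataset
    pretrained = nn.Sequential(
        nn.Linear(1000, 256), nn.ReLU(),
        nn.Linear(256, 64), nn.ReLU(),
    )

    for p in pretrained.parameters():
        p.requires_grad = False          # freeze the learned representation

    # Only this small task-specific head is trained on the limited labeled data
    model = nn.Sequential(pretrained, nn.Linear(64, 2))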
Q.153 Which of the following best describes the 'softmax' activation function?
It returns a value between 0 and 1 for each class, summing to 1
It outputs a binary decision
It normalizes features to zero mean
It is used for regression tasks
Explanation - Softmax converts logits to a probability distribution over classes.
Correct answer is: It returns a value between 0 and 1 for each class, summing to 1
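A short NumPy sketch with toy logits:

    import numpy as np

    def softmax(logits):
        z = logits - np.max(logits)        # subtract the max for numerical stability
        exp_z = np.exp(z)
        return exp_z / exp_z.sum()

    print(softmax(np.array([2.0, 1.0, 0.1])))   # values in (0, 1) that sum to 1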
Q.154 Which of the following is a benefit of using an 'ensemble' approach in bioinformatics?
It always improves accuracy
It reduces overfitting by averaging predictions
It eliminates the need for preprocessing
It speeds up training time
Explanation - Ensembles combine multiple models, mitigating individual model variance.
Correct answer is: It reduces overfitting by averaging predictions
Q.155 In a multi‑class classification problem, which metric measures the harmonic mean of precision and recall for each class?
F1‑Score
Accuracy
Recall
Precision
Explanation - The F1‑score aggregates precision and recall into a single metric per class.
Correct answer is: F1‑Score
Q.156 Which of the following is an example of a 'synthetic' dataset in machine learning?
Real‑world clinical data
Data generated by a simulation or generative model
A database of genomic sequences
A set of experimental measurements
Explanation - Synthetic data is artificially created rather than collected from experiments.
Correct answer is: Data generated by a simulation or generative model
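A quick scikit-learn example of generating a synthetic classification dataset (all parameter values are arbitrary):

    from sklearn.datasets import make_classification

    # Simulated dataset: 200 "samples" with 50 "gene" features, 2 classes
    X, y = make_classification(n_samples=200, n_features=50,
                               n_informative=10, random_state=0)
    print(X.shape, y.shape)                # (200, 50) (200,)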
Q.157 What is the primary role of a 'loss function' in training a machine learning model?
To regularize the model
To measure the difference between predicted and true values
To initialize weights
To determine the number of layers
Explanation - The loss guides weight updates by quantifying prediction error.
Correct answer is: To measure the difference between predicted and true values
Q.158 Which of the following is an advantage of using 'cross‑validation' during model development?
It increases training time
It provides a more reliable estimate of model performance
It reduces the number of hyperparameters
It eliminates the need for a test set
Explanation - Cross‑validation uses multiple splits, reducing variance in performance estimates.
Correct answer is: It provides a more reliable estimate of model performance
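A brief scikit-learn sketch with random toy data standing in for real features:

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score

    X = np.random.rand(150, 30)
    y = np.random.randint(0, 2, 150)

    scores = cross_val_score(RandomForestClassifier(random_state=0), X, y, cv=5)
    print(scores, scores.mean())           # one score per fold; the mean is a more stable estimate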
Q.159 Which of the following best describes a 'feature' in the context of machine learning?
A label for training
A single piece of information used to represent an instance
A hyperparameter of the model
The final prediction of the model
Explanation - Features are input attributes that the model uses to make predictions.
Correct answer is: A single piece of information used to represent an instance
Q.160 In a supervised learning scenario for disease classification, which of the following is considered a 'label'?
Gene expression levels
The patient’s age
The diagnosis (e.g., cancer vs. healthy)
The number of sequencing reads
Explanation - The label is the target variable the model learns to predict.
Correct answer is: The diagnosis (e.g., cancer vs. healthy)
Q.161 Which of the following best explains why 't‑SNE' is often used for visualizing high‑dimensional biological data?
It reduces dimensionality while preserving global structure
It preserves local neighborhood relationships in lower dimensions
It is computationally cheap
It always yields linear clusters
Explanation - t‑SNE captures local similarities, making clusters visible in 2‑D/3‑D plots.
Correct answer is: It preserves local neighborhood relationships in lower dimensions
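A minimal scikit-learn sketch with random data standing in for a cell-by-gene matrix; the perplexity value is a typical but arbitrary choice:

    import numpy as np
    from sklearn.manifold import TSNE

    X = np.random.rand(300, 2000)          # e.g., 300 cells x 2000 genes (toy)
    embedding = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)
    print(embedding.shape)                 # (300, 2) coordinates for a 2-D scatter plot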
