Q.1 Which omics layer measures the complete set of messenger RNA transcripts present in a cell at a given time?
Genomics
Proteomics
Transcriptomics
Metabolomics
Explanation - Transcriptomics focuses on profiling all RNA molecules, especially mRNA, to understand gene expression levels.
Correct answer is: Transcriptomics
Q.2 In the context of omics data integration, which technique combines gene expression data with protein‑protein interaction networks to identify active pathways?
Principal Component Analysis (PCA)
Weighted Gene Co‑expression Network Analysis (WGCNA)
Gene Set Enrichment Analysis (GSEA)
Network‑based Integration
Explanation - Network‑based integration overlays expression values onto interaction maps, revealing which pathways are active under the studied condition.
Correct answer is: Network‑based Integration
Q.3 Which database is primarily used for storing curated metabolic pathway information suitable for metabolomics integration?
KEGG
Reactome
UniProt
Pfam
Explanation - KEGG (Kyoto Encyclopedia of Genes and Genomes) provides detailed maps of metabolic pathways, widely used in metabolomics studies.
Correct answer is: KEGG
Q.4 What is the main purpose of the 'multi‑omics factor analysis' (MOFA) method?
To perform differential expression analysis on RNA‑seq data
To identify latent factors that explain variance across multiple omics layers
To predict protein tertiary structure
To align short DNA reads to a reference genome
Explanation - MOFA discovers hidden factors that capture shared and specific sources of variation across heterogeneous omics datasets.
Correct answer is: To identify latent factors that explain variance across multiple omics layers
Q.5 Which of the following is NOT a typical challenge when integrating genomics, transcriptomics, proteomics, and metabolomics data?
Different measurement scales
Missing data in one layer
Uniform sample preparation protocols
Varying data dimensionality
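Explanation - Uniform sample preparation would simplify integration rather than hinder it. In practice, preparation protocols differ greatly between omics platforms, which creates batch effects and compatibility issues; differing scales, missing data, and varying dimensionality are genuine challenges.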
Correct answer is: Uniform sample preparation protocols
Q.6 In systems biology, what does the term 'flux balance analysis' (FBA) primarily model?
Gene regulatory networks
Steady‑state metabolic fluxes
Protein folding pathways
RNA splicing events
Explanation - FBA uses stoichiometric models of metabolism to predict the distribution of fluxes under steady‑state assumptions.
Correct answer is: Steady‑state metabolic fluxes
Q.7 Which machine‑learning algorithm is commonly employed to predict phenotype from high‑dimensional multi‑omics data?
k‑Nearest Neighbours (k‑NN)
Random Forest
Linear Discriminant Analysis (LDA)
Naïve Bayes
Explanation - Random Forest handles many features, can assess variable importance, and works well with heterogeneous omics data.
Correct answer is: Random Forest
Q.8 What does the acronym 'eQTL' stand for in integrative genomics?
electronic Quantitative Trait Locus
expression Quantitative Trait Locus
enhancer Quantitative Transcription Level
epigenetic Quantitative Transfer Locus
Explanation - eQTLs are genomic loci that explain variation in gene expression levels across individuals.
Correct answer is: expression Quantitative Trait Locus
Q.9 Which statistical test is most appropriate for comparing metabolite concentrations between two experimental groups when the data are not normally distributed?
Student's t‑test
Mann‑Whitney U test
ANOVA
Chi‑square test
Explanation - The Mann‑Whitney U test is a non‑parametric alternative to the t‑test for comparing two independent groups.
Correct answer is: Mann‑Whitney U test
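Worked example - A minimal sketch of how the Mann‑Whitney U statistic in Q.9 can be computed from pooled ranks. The function name and the metabolite values are hypothetical, and the sketch stops at the U statistic (no p‑value is derived).

```python
def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b) for two independent samples, using average ranks for ties."""
    pooled = sorted(group_a + group_b)
    # Assign each distinct value the mean of the 1-based ranks it occupies.
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j
    rank_sum_a = sum(rank_of[x] for x in group_a)
    n_a, n_b = len(group_a), len(group_b)
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return u_a, u_b


# Hypothetical metabolite concentrations (arbitrary units) in two groups.
control = [1.2, 1.5, 1.9]
treated = [2.4, 3.1, 8.7]
print(mann_whitney_u(control, treated))  # (0.0, 9.0): the groups do not overlap
```

Because only ranks are used, the outlying value 8.7 has no more influence than any other largest observation, which is why the test suits non‑normal data.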
Q.10 In an integrated omics network, nodes typically represent:
Experimental time points
Biological entities such as genes, proteins, or metabolites
Statistical p‑values
Sequencing read lengths
Explanation - Network nodes correspond to entities (genes, proteins, metabolites) while edges represent relationships (interactions, correlations).
Correct answer is: Biological entities such as genes, proteins, or metabolites
Q.11 Which of the following is a common format for storing raw mass‑spectrometry proteomics data?
FASTQ
mzML
BED
GTF
Explanation - mzML is an open XML‑based format for mass‑spectrometry data, facilitating proteomics data exchange.
Correct answer is: mzML
Q.12 What is the main advantage of using a Bayesian framework for multi‑omics integration?
It eliminates the need for data preprocessing
It naturally incorporates prior knowledge and quantifies uncertainty
It guarantees faster computation than frequentist methods
It only works with binary data
Explanation - Bayesian models can embed prior information (e.g., known pathways) and provide posterior distributions reflecting uncertainty.
Correct answer is: It naturally incorporates prior knowledge and quantifies uncertainty
Q.13 Which of the following pipelines is specifically designed for integrating RNA‑seq and proteomics data?
MetaPhlAn
DESeq2
iProX
iProMix
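Explanation - iProMix jointly analyses gene expression and protein abundance. The other options serve different purposes: MetaPhlAn profiles metagenomes, DESeq2 tests differential expression in count data, and iProX is a proteomics data repository.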
Correct answer is: iProMix
Q.14 In the context of electrical engineering inspired models for systems biology, what does a 'circuit' typically represent?
A series of DNA nucleotides
A set of interconnected biochemical reactions
A hardware device for measuring voltage
A computer algorithm for sorting data
Explanation - Circuit analogies map biochemical pathways to electrical circuits, where nodes are metabolites and edges are reactions (currents).
Correct answer is: A set of interconnected biochemical reactions
Q.15 Which dimension‑reduction technique is frequently applied to visualize multi‑omics data in two dimensions?
t‑Distributed Stochastic Neighbor Embedding (t‑SNE)
BLAST
HMMER
BWA
Explanation - t‑SNE preserves local structure and is popular for visualizing high‑dimensional omics datasets.
Correct answer is: t‑Distributed Stochastic Neighbor Embedding (t‑SNE)
Q.16 When integrating omics data, a 'canonical correlation analysis' (CCA) primarily aims to:
Identify the most expressed gene
Find linear combinations of variables that are maximally correlated across datasets
Cluster samples into discrete groups
Predict protein tertiary structure
Explanation - CCA finds pairs of weighted linear combinations of variables (one combination per dataset) with the highest correlation, useful for cross‑modal associations.
Correct answer is: Find linear combinations of variables that are maximally correlated across datasets
Q.17 Which of the following best describes a 'metabolite‑gene interaction network'?
A graph where metabolites are connected only to each other
A bipartite graph linking metabolites to the genes encoding enzymes that act on them
A hierarchical tree of gene families
A linear pathway of DNA replication
Explanation - Bipartite networks have two node types (metabolites and genes) and edges represent enzymatic reactions or regulatory links.
Correct answer is: A bipartite graph linking metabolites to the genes encoding enzymes that act on them
Q.18 Which R package provides functions for the integration of multi‑omics data using a similarity network fusion (SNF) approach?
edgeR
SNFtool
limma
DESeq2
Explanation - SNFtool implements similarity network fusion, merging patient similarity graphs from each omics layer.
Correct answer is: SNFtool
Q.19 What does the term 'batch effect' refer to in omics experiments?
The influence of biological variation on data
Systematic non‑biological differences introduced by processing samples in different batches
A type of neural network layer
The random noise generated by sequencing machines
Explanation - Batch effects arise from variations in reagents, instruments, or operators and must be corrected before integration.
Correct answer is: Systematic non‑biological differences introduced by processing samples in different batches
Q.20 In a multi‑omics study of cancer, which of the following would be considered a 'phenotypic' endpoint?
DNA methylation level
Gene expression fold‑change
Patient survival time
Metabolite concentration
Explanation - Phenotypic endpoints describe observable outcomes, such as survival, disease stage, or response to therapy.
Correct answer is: Patient survival time
Q.21 Which of the following is a key benefit of using cloud‑based platforms (e.g., Terra, Galaxy) for omics data integration?
They eliminate the need for statistical analysis
They provide scalable storage and compute resources, enabling reproducible pipelines
They guarantee 100 % data security
They automatically generate biological hypotheses
Explanation - Cloud platforms allow large datasets to be processed efficiently and support workflow sharing for reproducibility.
Correct answer is: They provide scalable storage and compute resources, enabling reproducible pipelines
Q.22 Which method uses a graph‑based representation to combine gene expression and protein interaction data for module detection?
Markov Cluster Algorithm (MCL)
Fast Fourier Transform (FFT)
Hidden Markov Model (HMM)
BLAST
Explanation - MCL simulates random walks on a graph to identify densely connected clusters (modules) across integrated data.
Correct answer is: Markov Cluster Algorithm (MCL)
Q.23 In the context of signal processing applied to omics time‑series data, what does the term 'Fourier transform' accomplish?
Converts the data from the time domain to the frequency domain
Normalizes gene expression values
Aligns DNA sequences
Predicts protein secondary structure
Explanation - Fourier transform decomposes a time‑dependent signal into its constituent frequencies, useful for detecting periodic patterns.
Correct answer is: Converts the data from the time domain to the frequency domain
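Worked example - A rough sketch of the idea in Q.23, assuming a hypothetical expression time series of 24 evenly spaced samples that oscillates with a period of 6 samples. It uses a naive discrete Fourier transform for clarity; a real analysis would normally use an FFT library.

```python
import cmath
import math


def dft_magnitudes(signal):
    """Magnitude of each discrete Fourier coefficient (naive O(n^2) DFT)."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal)))
        for k in range(n)
    ]


# Hypothetical time series: 24 samples, one full oscillation every 6 samples.
series = [math.sin(2 * math.pi * t / 6) for t in range(24)]
mags = dft_magnitudes(series)

# Search only the first half of the spectrum; for real input the second half mirrors it.
dominant = max(range(1, len(series) // 2), key=lambda k: mags[k])
period = len(series) / dominant
print(dominant, period)  # frequency index 4, i.e. a period of 6.0 samples
```

The peak in the frequency domain recovers the period that was hidden in the time‑domain values.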
Q.24 Which of the following best describes 'data imputation' in multi‑omics integration?
Removing all samples with missing values
Estimating missing measurements using statistical or machine‑learning models
Doubling the size of the dataset
Converting continuous data to binary
Explanation - Imputation fills gaps to enable downstream analyses that require complete matrices.
Correct answer is: Estimating missing measurements using statistical or machine‑learning models
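Worked example - One of the simplest imputation strategies for Q.24 is to replace each missing value with the mean of the observed values for that feature. The sketch below assumes a small samples‑by‑features matrix with `None` marking missing entries and at least one observed value per feature; model‑based methods (e.g., k‑nearest‑neighbour imputation) are usually preferred in practice.

```python
def impute_feature_means(matrix):
    """Replace None entries with the mean of the observed values in the same column."""
    n_cols = len(matrix[0])
    col_means = []
    for c in range(n_cols):
        observed = [row[c] for row in matrix if row[c] is not None]
        col_means.append(sum(observed) / len(observed))
    return [
        [col_means[c] if row[c] is None else row[c] for c in range(n_cols)]
        for row in matrix
    ]


# Hypothetical data: rows are samples, columns are features, None is missing.
data = [[1.0, 4.0], [None, 6.0], [3.0, None]]
print(impute_feature_means(data))  # [[1.0, 4.0], [2.0, 6.0], [3.0, 5.0]]
```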
Q.25 Which technology is primarily used to generate quantitative proteomics data for integration with other omics layers?
RNA‑seq
SILAC (Stable Isotope Labelling by Amino acids in Cell culture)
ChIP‑seq
ATAC‑seq
Explanation - SILAC metabolically incorporates heavy‑isotope‑labelled amino acids into proteins, allowing accurate relative quantification by mass spectrometry.
Correct answer is: SILAC (Stable Isotope Labelling by Amino acids in Cell culture)
Q.26 What does the plural term 'omics' refer to?
A single type of molecular data
A collection of various high‑throughput molecular datasets (genomics, proteomics, etc.)
A statistical test
An electrical circuit component
Explanation - The suffix '‑omics' denotes comprehensive profiling of a particular molecular class; the plural 'omics' refers to several such layers taken together.
Correct answer is: A collection of various high‑throughput molecular datasets (genomics, proteomics, etc.)
Q.27 Which of the following is a common format for representing metabolic network models used in flux balance analysis?
SBML (Systems Biology Markup Language)
FASTA
GFF3
VCF
Explanation - SBML encodes biochemical reaction networks, enabling exchange of FBA models across tools.
Correct answer is: SBML (Systems Biology Markup Language)
Q.28 In multi‑omics integration, what is a 'latent variable'?
A directly measured metabolite concentration
A hidden factor inferred from the data that captures shared variation
A sequencing error rate
The temperature at which the experiment was performed
Explanation - Latent variables are not directly observed but are estimated to explain patterns across datasets (e.g., in MOFA).
Correct answer is: A hidden factor inferred from the data that captures shared variation
Q.29 Which of the following approaches is specifically designed for integrating binary (presence/absence) and continuous omics data?
Partial Least Squares (PLS)
Jaccard similarity
Mixed‑type data integration via Generalized Canonical Correlation Analysis (GCCA)
Chi‑square test
Explanation - GCCA extends CCA to more than two data blocks, and mixed‑type formulations accommodate heterogeneous variables, including binary and continuous ones.
Correct answer is: Mixed‑type data integration via Generalized Canonical Correlation Analysis (GCCA)
Q.30 Which of the following best describes the purpose of a 'heatmap' in omics data visualization?
To display hierarchical clustering of samples and features based on similarity
To compute statistical p‑values
To model electrical circuits
To store raw sequencing reads
Explanation - Heatmaps use color gradients to represent values and often include dendrograms for clustering patterns.
Correct answer is: To display hierarchical clustering of samples and features based on similarity
Q.31 In a study combining genomics and metabolomics, a significant association between a SNP and a metabolite level is called:
eQTL
mQTL
pQTL
sQTL
Explanation - mQTL (metabolite quantitative trait locus) links genetic variants to metabolite concentrations.
Correct answer is: mQTL
Q.32 Which programming language is most widely used for large‑scale omics data manipulation and integration?
Java
Python
MATLAB
C++
Explanation - Python, together with libraries like pandas, scikit‑learn, and Biopython, is popular for handling and integrating omics data.
Correct answer is: Python
Q.33 When performing pathway enrichment analysis on integrated transcriptomics‑proteomics data, which statistic is commonly used to assess over‑representation?
Pearson correlation
Fisher's exact test
Kolmogorov‑Smirnov test
Log‑rank test
Explanation - Fisher's exact test evaluates whether a set of genes/proteins is enriched in a pathway more than expected by chance.
Correct answer is: Fisher's exact test
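Worked example - For over‑representation analysis as in Q.33, the one‑sided Fisher's exact test reduces to a hypergeometric tail probability. The sketch below assumes hypothetical numbers: a universe of 20 genes, a pathway of 5 genes, 5 significant hits, and 3 hits that fall in the pathway.

```python
from math import comb


def enrichment_p_value(universe, pathway_size, hits, overlap):
    """One-sided Fisher's exact test: P(overlap or more) under the hypergeometric null."""
    total = comb(universe, hits)
    upper = min(pathway_size, hits)
    tail = sum(
        comb(pathway_size, k) * comb(universe - pathway_size, hits - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


p = enrichment_p_value(universe=20, pathway_size=5, hits=5, overlap=3)
print(round(p, 4))  # 0.0726
```

Here 1126 of the 15504 possible hit lists contain three or more pathway genes, so this overlap would not be significant at the 0.05 level.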
Q.34 Which of the following describes the concept of 'cross‑omics correlation'?
Correlation among samples within a single omics layer
Correlation between measurements from different omics layers (e.g., gene expression vs. metabolite level)
Correlation between hardware components in an electrical circuit
Correlation between different species' genomes
Explanation - Cross‑omics correlation quantifies how variations in one molecular layer relate to another, revealing functional links.
Correct answer is: Correlation between measurements from different omics layers (e.g., gene expression vs. metabolite level)
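Worked example - A minimal sketch of cross‑omics correlation (Q.34), assuming hypothetical paired measurements of an enzyme's mRNA level and its product metabolite across the same five samples. Pearson correlation is used here; Spearman correlation is a common alternative when relationships are monotonic but not linear.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical values measured on the same five samples in two omics layers.
enzyme_mrna = [1.0, 2.0, 3.0, 4.0, 5.0]
product_metabolite = [2.1, 3.9, 6.2, 8.1, 9.8]
r = pearson(enzyme_mrna, product_metabolite)
print(round(r, 3))
```

The samples must be matched across layers; correlating unpaired measurements would be meaningless.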
Q.35 Which of the following is a widely used benchmark dataset for testing multi‑omics integration pipelines?
CIFAR‑10
TCGA (The Cancer Genome Atlas)
ImageNet
COCO
Explanation - TCGA provides matched genomics, transcriptomics, proteomics, and clinical data for many cancer types, ideal for integration studies.
Correct answer is: TCGA (The Cancer Genome Atlas)
Q.36 In a multi‑omics graph, an edge weight representing 'confidence' is most likely derived from:
The length of the DNA sequence
The p‑value of an interaction experiment
The temperature of the lab
The color of the metabolite
Explanation - Edge weights often encode statistical confidence (e.g., −log10 of the p‑value, so that smaller p‑values give larger weights) or the strength of interaction.
Correct answer is: The p‑value of an interaction experiment
Q.37 Which of the following methods can be used to assess the robustness of a multi‑omics clustering result?
Silhouette score
Fast Fourier Transform
PCR amplification
Gel electrophoresis
Explanation - Silhouette scores quantify how well each sample fits within its cluster versus other clusters.
Correct answer is: Silhouette score
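Worked example - A small sketch of the silhouette score from Q.37, assuming one‑dimensional points, absolute difference as the distance, and at least two clusters. The data and function name are hypothetical.

```python
def mean_silhouette(points, labels):
    """Mean silhouette width for 1-D points, using absolute difference as distance."""
    scores = []
    for i, (p, lab) in enumerate(zip(points, labels)):
        own = [
            abs(p - q)
            for j, (q, other_lab) in enumerate(zip(points, labels))
            if other_lab == lab and j != i
        ]
        if not own:  # singleton cluster: silhouette is defined as 0
            scores.append(0.0)
            continue
        a = sum(own) / len(own)  # mean distance within the point's own cluster
        b = min(  # mean distance to the nearest other cluster
            sum(abs(p - q) for q, l in zip(points, labels) if l == other) / labels.count(other)
            for other in set(labels)
            if other != lab
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


points = [0.0, 1.0, 10.0, 11.0]
good = mean_silhouette(points, [0, 0, 1, 1])  # compact, well-separated clusters
bad = mean_silhouette(points, [0, 1, 0, 1])   # clusters that mix distant points
print(round(good, 3), round(bad, 3))
```

Values near 1 indicate compact, well‑separated clusters, while values near or below 0 suggest samples that sit closer to another cluster than to their own.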
Q.38 What does the term 'omics integration' most directly imply?
Sequencing a new genome
Merging data from different molecular layers to gain a holistic view of the biological system
Building a physical circuit board
Running a single‑gene knockout experiment
Explanation - Integration combines genomics, transcriptomics, proteomics, metabolomics, etc., to capture system‑level behavior.
Correct answer is: Merging data from different molecular layers to gain a holistic view of the biological system
Q.39 Which of the following is an advantage of using 'sparse' methods (e.g., Sparse CCA) for multi‑omics integration?
They require no computational resources
They select a subset of informative features, improving interpretability and reducing over‑fitting
They guarantee 100 % accuracy
They only work with single‑cell data
Explanation - Sparsity constraints force many coefficients to zero, highlighting key biomarkers.
Correct answer is: They select a subset of informative features, improving interpretability and reducing over‑fitting
Q.40 In the context of signal processing for omics, what does 'filtering' typically accomplish?
Removing low‑quality or noisy measurements from the dataset
Amplifying the DNA sequences
Changing the color of a gel image
Increasing the sample size
Explanation - Filtering eliminates unwanted variation (e.g., low‑count genes) to improve downstream analysis.
Correct answer is: Removing low‑quality or noisy measurements from the dataset
Q.41 Which of the following is a standard method to correct for multiple hypothesis testing in omics studies?
Bonferroni correction
Euclidean distance
Maximum likelihood estimation
FastQC
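Explanation - The Bonferroni correction controls the family‑wise error rate by dividing the significance threshold by the number of tests. Benjamini‑Hochberg is another common approach, but it controls the false discovery rate and is not among the options listed here.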
Correct answer is: Bonferroni correction
Q.42 When integrating single‑cell RNA‑seq with single‑cell ATAC‑seq data, a common strategy is to:
Ignore the ATAC‑seq data
Map chromatin accessibility peaks to nearby genes and jointly embed both modalities
Convert ATAC‑seq reads into proteins
Perform a Western blot
Explanation - Linking ATAC‑seq peaks to gene promoters enables joint dimensionality reduction (e.g., using Seurat or LIGER).
Correct answer is: Map chromatin accessibility peaks to nearby genes and jointly embed both modalities
Q.43 Which of the following is a key metric to evaluate the predictive performance of a multi‑omics classifier?
Area Under the Receiver Operating Characteristic curve (AUROC)
Mean GC content
Number of chromosomes
Length of the protein coding region
Explanation - AUROC measures the ability of a classifier to discriminate between classes across all thresholds.
Correct answer is: Area Under the Receiver Operating Characteristic curve (AUROC)
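Worked example - AUROC (Q.43) can be read as the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. The sketch below computes it by direct pairwise comparison, assuming binary labels coded 0/1 and hypothetical classifier scores.

```python
def auroc(scores, labels):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


print(auroc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))  # 0.75
```

Three of the four positive/negative pairs are ordered correctly, giving 0.75; a value of 0.5 corresponds to random ranking.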
Q.44 Which of the following best describes a 'meta‑analysis' in the context of omics data?
Combining results from multiple independent studies to increase statistical power
Sequencing the genome of a new organism
Running a single experiment on a single sample
Designing a new laboratory instrument
Explanation - Meta‑analysis aggregates findings across studies, often using standardized effect sizes.
Correct answer is: Combining results from multiple independent studies to increase statistical power
Q.45 In the context of electrical engineering analogies, what does the term 'capacitance' correspond to in a biological network?
The ability of a metabolite pool to buffer concentration changes
The speed of DNA replication
The number of chromosomes
The intensity of fluorescence
Explanation - Capacitance stores charge; analogously, large metabolite pools can absorb fluctuations, acting as a buffer.
Correct answer is: The ability of a metabolite pool to buffer concentration changes
Q.46 Which of the following is a commonly used statistical model for integrating count‑based RNA‑seq data with continuous proteomics measurements?
Negative binomial regression with Gaussian residuals
Linear regression with Poisson errors
Logistic regression with binary outcome
Decision tree without any distributional assumptions
Explanation - RNA‑seq counts follow a negative binomial distribution; proteomics is often modeled as Gaussian, requiring mixed‑distribution models.
Correct answer is: Negative binomial regression with Gaussian residuals
Q.47 Which of the following software tools is specifically designed for visualizing multi‑omics networks?
Cytoscape
BLAST
BWA
FastQC
Explanation - Cytoscape provides flexible visualization and analysis of complex biological networks, including multi‑omics data.
Correct answer is: Cytoscape
Q.48 In a multi‑omics study, a 'feature' usually refers to:
A lab technician
An individual measurable variable such as a gene, protein, or metabolite
The temperature of the incubator
A type of electrical resistor
Explanation - Features are the columns of a data matrix representing molecular entities measured in the experiment.
Correct answer is: An individual measurable variable such as a gene, protein, or metabolite
Q.49 Which of the following is an example of a 'knowledge‑based' integration method?
Using a pre‑compiled pathway database to guide the merging of transcriptomics and metabolomics data
Randomly shuffling the data matrices
Applying a generic clustering algorithm without any prior information
Discarding all missing values
Explanation - Knowledge‑based methods incorporate prior biological knowledge (e.g., pathways) to inform integration.
Correct answer is: Using a pre‑compiled pathway database to guide the merging of transcriptomics and metabolomics data
Q.50 When integrating epigenomics (e.g., DNA methylation) with transcriptomics, a negative correlation between promoter methylation and gene expression suggests:
Methylation activates transcription
Methylation represses transcription
Methylation has no effect on transcription
Methylation only affects protein folding
Explanation - Promoter hyper‑methylation generally silences gene expression, leading to an inverse relationship.
Correct answer is: Methylation represses transcription
Q.51 Which of the following is a typical preprocessing step for metabolomics data before integration?
Peak detection, alignment, and normalization
Base‑calling of DNA reads
Splice junction mapping
K‑mer counting
Explanation - Metabolomics pipelines first detect ion peaks, align across samples, and normalize for batch effects.
Correct answer is: Peak detection, alignment, and normalization
Q.52 Which of the following is a characteristic of 'multi‑omics factor analysis' (MOFA) compared to standard PCA?
MOFA can handle multiple data modalities simultaneously, while PCA works on a single matrix
MOFA only works with binary data
MOFA requires no statistical assumptions
MOFA is a type of neural network
Explanation - MOFA extends PCA to jointly model several omics layers, learning shared and view‑specific factors.
Correct answer is: MOFA can handle multiple data modalities simultaneously, while PCA works on a single matrix
Q.53 In the context of systems biology, a 'digital twin' of a cell would most likely require:
Integration of multi‑omics data, kinetic parameters, and computational models to simulate cellular behavior
A physical replica of the cell made from silicon
Only the genome sequence of the organism
A high‑resolution photograph of the cell
Explanation - Digital twins aim to recreate the dynamic state of a biological system using comprehensive data and models.
Correct answer is: Integration of multi‑omics data, kinetic parameters, and computational models to simulate cellular behavior
Q.54 Which of the following normalization methods is commonly used for RNA‑seq count data before integration?
TPM (Transcripts Per Million)
RPKM (Reads Per Kilobase of transcript per Million mapped reads)
DESeq2's median of ratios
All of the above
Explanation - TPM, RPKM, and DESeq2's median‑of‑ratios are all accepted normalization strategies for RNA‑seq counts.
Correct answer is: All of the above
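Worked example - A minimal sketch of the TPM calculation mentioned in Q.54, assuming hypothetical read counts and transcript lengths in kilobases. Counts are first divided by length, then rescaled so that each sample sums to one million.

```python
def tpm(counts, lengths_kb):
    """Transcripts Per Million: length-normalise counts, then scale the sample to 1e6."""
    rates = [c / length for c, length in zip(counts, lengths_kb)]
    scale = sum(rates) / 1e6
    return [r / scale for r in rates]


# Hypothetical sample: counts grow with transcript length, so TPM values come out equal.
values = tpm(counts=[10, 20, 30], lengths_kb=[1.0, 2.0, 3.0])
print([round(v, 1) for v in values])
```

Because every sample sums to the same total, TPM values are easier to compare across samples than raw counts, although they are still relative abundances.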
Q.55 Which type of graph representation is most appropriate for capturing many‑to‑many relationships between genes, proteins, and metabolites?
Bipartite graph
Tree
Linear chain
Simple undirected graph with no edge labels
Explanation - Bipartite graphs have two node sets (e.g., genes and metabolites) with edges only between sets, fitting many‑to‑many interactions; with three entity types the same idea extends to a multipartite graph.
Correct answer is: Bipartite graph
Q.56 Which of the following is a key advantage of using cloud‑based Jupyter notebooks for omics integration workflows?
They eliminate the need for any coding
They enable interactive analysis, reproducibility, and sharing of code and results
They guarantee that all results are correct
They replace the need for biological expertise
Explanation - Jupyter notebooks combine code, documentation, and visualizations in a shareable format, facilitating collaborative work.
Correct answer is: They enable interactive analysis, reproducibility, and sharing of code and results
Q.57 In a multi‑omics dataset, the term 'sample' refers to:
A single measured variable like a gene
A biological replicate or individual from which all omics layers were collected
The size of the hard drive
The number of CPU cores used
Explanation - Each sample is a unit (e.g., a patient) for which multiple omics measurements are obtained.
Correct answer is: A biological replicate or individual from which all omics layers were collected
Q.58 Which of the following is a commonly used metric for evaluating the similarity between two omics data matrices?
Jaccard index
Pearson correlation coefficient
Hamming distance
All of the above
Explanation - Depending on data type, any of these similarity metrics may be employed to compare matrices.
Correct answer is: All of the above
Q.59 In a systems‑biology model, the term 'steady‑state' implies:
All concentrations are zero
The net change of each component over time is zero (dX/dt = 0)
The system is increasing exponentially
The temperature of the incubator is constant
Explanation - At steady state, reactions continue but production and consumption of each component balance, so concentrations remain constant.
Correct answer is: The net change of each component over time is zero (dX/dt = 0)
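Worked example - In stoichiometric models the steady‑state condition from Q.59 is written as S · v = 0, where S is the stoichiometric matrix and v the flux vector. The sketch below checks this for a hypothetical three‑reaction toy pathway (uptake of A, conversion of A to B, export of B).

```python
def net_change(stoichiometry, fluxes):
    """Compute dX/dt = S . v for each metabolite (one row of S per metabolite)."""
    return [sum(s * v for s, v in zip(row, fluxes)) for row in stoichiometry]


# Columns: uptake -> A, A -> B, B -> export (hypothetical toy pathway).
S = [
    [1, -1, 0],  # metabolite A: produced by uptake, consumed by conversion
    [0, 1, -1],  # metabolite B: produced by conversion, consumed by export
]

print(net_change(S, [2.0, 2.0, 2.0]))  # [0.0, 0.0] -> steady state
print(net_change(S, [2.0, 1.0, 1.0]))  # [1.0, 0.0] -> A accumulates
```

Flux balance analysis searches among the flux vectors that satisfy this constraint for one that optimises an objective such as biomass production.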
Q.60 Which of the following is a typical outcome of a successful multi‑omics integration study?
Identification of multi‑layer biomarkers predictive of disease outcome
Discovery of a single gene mutation only
Construction of a physical device
Measurement of room temperature
Explanation - Integrative analyses often reveal combined signatures (genes, proteins, metabolites) associated with phenotypes.
Correct answer is: Identification of multi‑layer biomarkers predictive of disease outcome
Q.61 Which of the following is NOT a common dimension‑reduction technique for omics data?
Principal Component Analysis (PCA)
Independent Component Analysis (ICA)
Linear Regression
Uniform Manifold Approximation and Projection (UMAP)
Explanation - Linear regression models relationships; PCA, ICA, and UMAP are used to reduce dimensionality.
Correct answer is: Linear Regression
Q.62 When combining omics data from different species, a common strategy is to:
Ignore orthology and treat each gene as unique
Map genes to orthologous groups or conserved pathways before integration
Convert all data to binary format
Use only the longest chromosome
Explanation - Orthology mapping allows comparable functional units across species to be aligned for integration.
Correct answer is: Map genes to orthologous groups or conserved pathways before integration
Q.63 Which of the following best describes the purpose of a 'reference genome' in omics pipelines?
A template used for aligning sequencing reads and annotating variants
A physical model of a cell
A set of electrical schematics
A statistical test for differential expression
Explanation - Reference genomes provide a coordinate system for mapping reads and interpreting genomic alterations.
Correct answer is: A template used for aligning sequencing reads and annotating variants
Q.64 Which of the following is a common approach to integrate time‑series multi‑omics data?
Dynamic Bayesian networks
Static heatmaps
Simple linear regression on a single time point
K‑means clustering without considering time
Explanation - Dynamic Bayesian networks model temporal dependencies across multiple data types.
Correct answer is: Dynamic Bayesian networks
Q.65 In a multi‑omics workflow, a 'pipeline' most commonly refers to:
A series of computational steps that process raw data into analysis‑ready matrices
A physical tube in the laboratory
A type of electrical transformer
A brand of coffee machine
Explanation - Pipelines automate tasks such as quality control, alignment, quantification, and integration.
Correct answer is: A series of computational steps that process raw data into analysis‑ready matrices
Q.66 Which of the following is a typical use of 'gene set enrichment analysis' (GSEA) after multi‑omics integration?
To identify pathways significantly associated with coordinated changes across layers
To compute the molecular weight of a protein
To design a microchip layout
To measure the voltage of a battery
Explanation - GSEA evaluates whether predefined gene sets show consistent enrichment across integrated data.
Correct answer is: To identify pathways significantly associated with coordinated changes across layers
Q.67 Which of the following best explains why 'log‑transformation' is frequently applied to omics data before integration?
It converts all values to integers
It stabilizes variance and makes data more normally distributed
It removes all missing values
It changes the biological meaning of the data
Explanation - Log transformation compresses large ranges, reduces heteroscedasticity, and facilitates downstream statistical analysis.
Correct answer is: It stabilizes variance and makes data more normally distributed
Q.68 When using a 'random forest' for multi‑omics classification, the 'feature importance' scores help to:
Identify which omics features contribute most to the predictive model
Determine the temperature of the experiment
Calculate the p‑value of each gene
Generate a 3‑D image of a cell
Explanation - Feature importance quantifies each variable's contribution to classification accuracy.
Correct answer is: Identify which omics features contribute most to the predictive model
Q.69 Which of the following is a commonly used metric for evaluating the quality of a multi‑omics clustering solution?
Adjusted Rand Index (ARI)
Ohm's Law
Molar mass
Nucleotide GC content
Explanation - ARI compares the similarity between predicted cluster assignments and a ground‑truth labeling.
Correct answer is: Adjusted Rand Index (ARI)
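Worked example - A compact sketch of the Adjusted Rand Index from Q.69, computed from pair counts in the contingency table. It assumes at least two samples and non‑degenerate labelings (the denominator is zero when, for example, both labelings put every sample in one cluster).

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """ARI from pair counts; 1.0 means identical partitions, about 0 means chance level."""
    n = len(labels_true)
    cells = Counter(zip(labels_true, labels_pred))
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())
    expected = sum_true * sum_pred / comb(n, 2)
    maximum = (sum_true + sum_pred) / 2
    return (sum_cells - expected) / (maximum - expected)


print(adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0]))  # 1.0: same partition, renamed labels
print(adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1]))  # negative: worse than chance
```

The score depends only on which samples are grouped together, not on the cluster names, so relabelled but identical partitions still score 1.0.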
Q.70 Which of the following is a standard method for integrating metabolomics data with gene expression data to infer active metabolic pathways?
Joint Pathway Analysis in MetaboAnalyst
BLAST alignment
PCR amplification
Northern blotting
Explanation - MetaboAnalyst's Joint Pathway Analysis combines metabolites and gene expression to highlight enriched pathways.
Correct answer is: Joint Pathway Analysis in MetaboAnalyst
Q.71 In multi‑omics studies, the term 'orthogonal' data refers to:
Data that provides complementary information and is statistically independent
Data that is measured on the same platform
Data that has the same units
Data that is irrelevant to the study
Explanation - Orthogonal datasets capture different biological aspects, enabling richer integration.
Correct answer is: Data that provides complementary information and is statistically independent
Q.72 Which of the following is a typical challenge when integrating single‑cell omics data across modalities?
Different cell capture technologies lead to varying cell barcodes and missing modalities for some cells
All cells have identical expression profiles
Sequencing machines cannot read DNA
Proteins are not present in cells
Explanation - Integration must handle missing data and align cells across different measurement platforms.
Correct answer is: Different cell capture technologies lead to varying cell barcodes and missing modalities for some cells
Q.73 Which of the following is an example of a 'multi‑omics' disease biomarker?
A signature comprising a specific gene mutation, its mRNA expression level, the corresponding protein abundance, and a related metabolite concentration
A single blood pressure reading
The color of a petri dish
The brand of a pipette
Explanation - Multi‑omics biomarkers combine information from several molecular layers to improve diagnostic power.
Correct answer is: A signature comprising a specific gene mutation, its mRNA expression level, the corresponding protein abundance, and a related metabolite concentration
Q.74 Which of the following statistical corrections is most appropriate when testing thousands of metabolites for association with a phenotype?
Benjamini‑Hochberg false discovery rate (FDR) control
Bonferroni correction
No correction needed
Z‑score transformation
Explanation - FDR control balances discovery and false positives better than the overly conservative Bonferroni correction for large numbers of tests.
Correct answer is: Benjamini‑Hochberg false discovery rate (FDR) control
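Illustration - A minimal sketch of the Benjamini‑Hochberg adjustment next to Bonferroni, written in plain Python. The p‑values are made up for five hypothetical metabolites, and the function name is illustrative.

```python
def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values (q-values) in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the q-values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = p_values[idx] * m / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


# Made-up p-values for five hypothetical metabolites.
p_vals = [0.001, 0.012, 0.021, 0.035, 0.60]
q_vals = benjamini_hochberg(p_vals)

alpha = 0.05
bh_hits = [q <= alpha for q in q_vals]
bonferroni_hits = [min(1.0, p * len(p_vals)) <= alpha for p in p_vals]
```

On this toy input the BH procedure keeps more features than Bonferroni at the same threshold, which is the trade‑off the explanation describes.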
Q.75 In a systems‑biology model, 'feedback inhibition' is analogous to which electrical component behavior?
A resistor that reduces current as voltage increases
A capacitor that stores charge
An amplifier that boosts signal strength
A diode that allows current in one direction
Explanation - Feedback inhibition reduces pathway flux as the end product accumulates; the analogy is loose, resting on the idea of a resistive element that limits current flow.
Correct answer is: A resistor that reduces current as voltage increases
Q.76 Which of the following is a common method for integrating heterogeneous omics data into a single predictive model?
Multi‑view deep learning (e.g., autoencoders for each omics type concatenated downstream)
Single‑gene PCR
Gel electrophoresis
Manual counting of cells
Explanation - Multi‑view deep learning learns modality‑specific representations before merging them for prediction.
Correct answer is: Multi‑view deep learning (e.g., autoencoders for each omics type concatenated downstream)
Q.77 Which of the following best defines a 'metabolite‑protein interaction database'?
A curated collection of experimentally verified links between metabolites and the proteins that bind or transform them
A list of DNA sequences
A spreadsheet of weather data
A catalog of electrical resistors
Explanation - Such databases (e.g., STITCH) enable linking metabolomics data with proteomics for integrated analysis.
Correct answer is: A curated collection of experimentally verified links between metabolites and the proteins that bind or transform them
Q.78 Which of the following is a typical output of a successful multi‑omics integration pipeline?
A ranked list of multi‑layer features (genes, proteins, metabolites) associated with a clinical phenotype
A single nucleotide sequence of an unknown virus
A photograph of a lab bench
A list of coffee brands
Explanation - Integration aims to uncover biologically relevant, cross‑modal signatures linked to phenotypes.
Correct answer is: A ranked list of multi‑layer features (genes, proteins, metabolites) associated with a clinical phenotype
Q.79 When visualizing integrated multi‑omics data on a 2‑D scatter plot, which technique is most suitable for preserving both local and global structure?
Uniform Manifold Approximation and Projection (UMAP)
Principal Component Analysis (PCA)
Histogram
Box plot
Explanation - UMAP is a non‑linear embedding that preserves local neighborhoods while retaining much of the global arrangement; linear projections such as PCA can miss non‑linear structure.
Correct answer is: Uniform Manifold Approximation and Projection (UMAP)
Q.80 Which of the following is a widely used statistical framework for integrating multiple quantitative traits (e.g., multi‑omics phenotypes) in genome‑wide association studies?
Multi‑Trait Mixed Model (MTMM)
Simple linear regression
Chi‑square goodness of fit
Fourier Transform
Explanation - MTMM models multiple correlated traits simultaneously, increasing power to detect shared genetic effects.
Correct answer is: Multi‑Trait Mixed Model (MTMM)
Q.81 In a systems biology context, the term 'modular' most accurately describes:
A network organization where groups of tightly connected nodes (modules) perform distinct functional tasks
A single gene without any interactions
A random assortment of proteins
An electrical circuit with no resistors
Explanation - Modularity reflects functional segregation, facilitating analysis of complex biological networks.
Correct answer is: A network organization where groups of tightly connected nodes (modules) perform distinct functional tasks
Q.82 Which of the following is a major benefit of using 'graph neural networks' (GNNs) for multi‑omics integration?
They can directly incorporate network topology and node features from different omics layers into a unified predictive model
They replace the need for any experimental data
They guarantee 100 % accuracy
They only work with image data
Explanation - GNNs learn representations that respect the structure of biological interaction graphs while integrating multi‑omics attributes.
Correct answer is: They can directly incorporate network topology and node features from different omics layers into a unified predictive model
Q.83 Which of the following is an example of a 'knowledge graph' used in omics integration?
A graph linking genes, proteins, metabolites, diseases, and drugs based on curated literature and databases
A spreadsheet of lab inventory
A list of electricity tariffs
A simple line plot of temperature over time
Explanation - Knowledge graphs encode heterogeneous biomedical entities and their relationships, facilitating integrative queries.
Correct answer is: A graph linking genes, proteins, metabolites, diseases, and drugs based on curated literature and databases
Q.84 When performing multi‑omics integration, the term 'latent space' refers to:
A reduced‑dimensional representation where the most informative patterns across datasets are captured
The physical lab space where samples are stored
The size of the hard drive
The number of technicians in the lab
Explanation - Latent space embeddings (e.g., from MOFA, autoencoders) summarize high‑dimensional data in a compact form.
Correct answer is: A reduced‑dimensional representation where the most informative patterns across datasets are captured
Q.85 Which of the following best describes the role of 'cross‑validation' in evaluating a multi‑omics predictive model?
It partitions the data into training and testing subsets multiple times to assess model generalizability
It normalizes the raw data
It measures the voltage of a circuit
It changes the sample labels randomly
Explanation - Cross‑validation helps avoid over‑fitting by testing model performance on unseen data folds.
Correct answer is: It partitions the data into training and testing subsets multiple times to assess model generalizability
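Illustration - A minimal sketch of how k‑fold cross‑validation partitions sample indices, assuming samples are independent so that a random shuffle is appropriate. The function name and the sample count are illustrative.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    # Shuffle with a fixed seed so the split is reproducible.
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Ten hypothetical patients split into five folds.
splits = list(k_fold_indices(n_samples=10, k=5))
```

Each sample appears in exactly one test fold, so every prediction used to score the model comes from a fit that never saw that sample. Any preprocessing that learns from the data (scaling, feature selection) would need to be refit inside each training fold to keep that guarantee.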
Q.86 Which of the following is a standard file format for storing gene expression matrices used in many integration tools?
CSV (Comma‑Separated Values)
FASTQ
BAM
VCF
Explanation - CSV files are simple, human‑readable tables suitable for expression matrices; the other formats store raw sequencing reads (FASTQ), read alignments (BAM), or variant calls (VCF).
Correct answer is: CSV (Comma‑Separated Values)
Q.87 In a multi‑omics experiment, the term 'paired samples' typically means:
Data from the same biological source measured across several omics platforms
Two unrelated individuals
Samples taken from different species
Samples measured on different days
Explanation - Paired samples ensure that each omics measurement corresponds to the same individual or tissue.
Correct answer is: Data from the same biological source measured across several omics platforms
Q.88 Which of the following approaches can be used to assess causality between a genetic variant and a downstream metabolite level?
Mendelian Randomization
Pearson correlation
Principal Component Analysis
t‑test
Explanation - Mendelian Randomization leverages genetic variants as instrumental variables to infer causal effects on traits such as metabolites.
Correct answer is: Mendelian Randomization
Q.89 Which of the following is an advantage of using 'ensemble methods' (e.g., stacking) for multi‑omics prediction?
They combine predictions from multiple models to improve overall accuracy and robustness
They require only a single data type
They eliminate the need for any preprocessing
They guarantee a perfect model
Explanation - Ensembles aggregate diverse model strengths, often outperforming any single method.
Correct answer is: They combine predictions from multiple models to improve overall accuracy and robustness
Q.90 Which of the following describes a 'canonical pathway' in the context of omics integration?
A well‑characterized, widely accepted biochemical pathway used as a reference for mapping omics data
A random set of genes with no known function
A type of electrical circuit
A list of laboratory equipment
Explanation - Canonical pathways (e.g., from KEGG or Reactome) serve as standardized maps for overlaying multi‑omics measurements.
Correct answer is: A well‑characterized, widely accepted biochemical pathway used as a reference for mapping omics data
Q.91 Which of the following is a typical method for reducing the dimensionality of a metabolomics matrix before integration with other omics?
Partial Least Squares Discriminant Analysis (PLS‑DA)
BLAST
PCR
Western blot
Explanation - PLS‑DA finds components that maximize class separation while handling correlated variables, useful for metabolomics.
Correct answer is: Partial Least Squares Discriminant Analysis (PLS‑DA)
Q.92 Which of the following best illustrates the concept of 'cross‑platform validation' in multi‑omics studies?
Confirming a biomarker discovered in transcriptomics by measuring the corresponding protein and metabolite in independent cohorts
Running the same sequencing instrument twice
Measuring the same sample with only one omics technology
Changing the brand of pipette tips
Explanation - Cross‑platform validation ensures that findings are robust across different measurement technologies and sample sets.
Correct answer is: Confirming a biomarker discovered in transcriptomics by measuring the corresponding protein and metabolite in independent cohorts
Q.93 Which of the following is a key reason to integrate epigenomics (e.g., DNA methylation) with transcriptomics?
To uncover regulatory mechanisms where epigenetic modifications influence gene expression
Because both datasets have the same file format
To increase the size of the data matrix without adding new information
Because DNA methylation directly measures protein concentration
Explanation - Epigenetic marks can activate or silence genes, so joint analysis reveals functional regulation.
Correct answer is: To uncover regulatory mechanisms where epigenetic modifications influence gene expression
Q.94 In a multi‑omics integration pipeline, the step that aligns features across datasets based on gene identifiers is known as:
Feature matching or mapping
Peak calling
Sequencing
Electrophoresis
Explanation - Feature matching ensures that the same biological entity (e.g., a gene) is consistently referenced across omics layers.
Correct answer is: Feature matching or mapping
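Illustration - A minimal sketch of feature matching between a transcriptomics table keyed by gene symbol and a proteomics table keyed by UniProt accession. The lookup table is a tiny hand‑written example and the measurement values are made up; a real pipeline would pull the mapping from a maintained annotation resource.

```python
# Small illustrative accession-to-symbol lookup (assumed, not exhaustive).
uniprot_to_symbol = {
    "P04637": "TP53",
    "P38398": "BRCA1",
    "P00533": "EGFR",
}

# Made-up measurements for one sample.
rna_by_symbol = {"TP53": 8.2, "EGFR": 11.4, "MYC": 9.9}
protein_by_accession = {"P04637": 1.3, "P00533": 2.7, "P38398": 0.4}


def match_features(rna, protein, id_map):
    """Return {gene symbol: (mRNA value, protein value)} for shared genes."""
    matched = {}
    for accession, abundance in protein.items():
        symbol = id_map.get(accession)
        # Keep only genes that are measured in both layers.
        if symbol in rna:
            matched[symbol] = (rna[symbol], abundance)
    return matched


matched = match_features(rna_by_symbol, protein_by_accession, uniprot_to_symbol)
```

Features measured in only one layer (here MYC and BRCA1) drop out of the matched set, which is worth reporting because it can silently shrink the analysis.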
Q.95 Which of the following statements about 'batch effect correction' is TRUE?
Methods like ComBat adjust data to reduce systematic differences between batches while preserving biological variation
Batch correction always removes all biological signals
Batch effects are only a concern for proteomics data
Batch correction can be ignored if the sample size is large
Explanation - ComBat and similar algorithms model batch‑specific effects and adjust the data accordingly.
Correct answer is: Methods like ComBat adjust data to reduce systematic differences between batches while preserving biological variation
Q.96 Which of the following is a common way to represent the relationship between a set of genes and the metabolites they affect in an integrated network?
A bipartite graph linking gene nodes to metabolite nodes via enzyme‑mediated edges
A simple list of gene names
A spreadsheet of temperature readings
A binary tree of hardware components
Explanation - Bipartite graphs capture the two‑type relationship (genes ↔ metabolites) mediated by enzymes.
Correct answer is: A bipartite graph linking gene nodes to metabolite nodes via enzyme‑mediated edges
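Illustration - A minimal sketch of a gene‑metabolite bipartite graph stored as two adjacency maps. The edges use three familiar glycolysis‑related enzymes as an example; the data structure and variable names are illustrative.

```python
from collections import defaultdict

# Each edge links an enzyme-coding gene to a metabolite it consumes or produces.
edges = [
    ("HK1", "glucose"),
    ("HK1", "glucose 6-phosphate"),
    ("PKM", "phosphoenolpyruvate"),
    ("PKM", "pyruvate"),
    ("LDHA", "pyruvate"),
    ("LDHA", "lactate"),
]

gene_to_metabolites = defaultdict(set)
metabolite_to_genes = defaultdict(set)
for gene, metabolite in edges:
    gene_to_metabolites[gene].add(metabolite)
    metabolite_to_genes[metabolite].add(gene)

# In a bipartite graph no node belongs to both partitions.
is_bipartite = not (set(gene_to_metabolites) & set(metabolite_to_genes))

# Genes that share a metabolite are candidates for joint interpretation.
genes_touching_pyruvate = metabolite_to_genes["pyruvate"]
```

Keeping both directions of the mapping makes it cheap to ask either "which metabolites does this gene affect?" or "which genes act on this metabolite?".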
Q.97 Which of the following tools is specifically designed for integrative analysis of multi‑omics time‑course data?
DyNB (Dynamic Bayesian Network)
Bowtie
FastQC
Trimmomatic
Explanation - DyNB models temporal dependencies across multiple data types, making it suitable for time‑course integration.
Correct answer is: DyNB (Dynamic Bayesian Network)
Q.98 When constructing a multi‑omics predictive model, why might one choose a 'regularized' regression method (e.g., Lasso) over ordinary least squares?
Regularization penalizes large coefficients, reducing over‑fitting in high‑dimensional data
Lasso guarantees a perfect fit
Regularized methods are faster for small datasets
Ordinary least squares does not work with any biological data
Explanation - Lasso (L1) adds a penalty that shrinks many coefficients to zero, aiding feature selection in omics data.
Correct answer is: Regularization penalizes large coefficients, reducing over‑fitting in high‑dimensional data
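Illustration - A minimal sketch of Lasso fitted by coordinate descent with soft‑thresholding, assuming the objective (1/2n)·||y − Xw||² + penalty·||w||₁ and no intercept. The data are a toy example in which only the first feature drives the response; function names are illustrative.

```python
def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty; return 0 inside [-penalty, penalty]."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_coordinate_descent(X, y, penalty, n_iter=100):
    """Fit Lasso coefficients by cycling over one feature at a time."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0
            for i in range(n):
                # Prediction from all features except j (partial residual).
                pred_without_j = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
            rho /= n
            z = sum(X[i][j] ** 2 for i in range(n)) / n  # assumes a non-zero column
            w[j] = soft_threshold(rho, penalty) / z
    return w


# Toy data: y depends only on the first feature; the second is noise-like.
X = [[1.0, 1.0], [2.0, -1.0], [3.0, 1.0], [4.0, -1.0]]
y = [2.0, 4.0, 6.0, 8.0]
weights = lasso_coordinate_descent(X, y, penalty=0.5)
```

The soft‑thresholding step is what sets weak coefficients exactly to zero, which is why Lasso doubles as a feature selector; the surviving coefficient is also shrunk slightly below its unpenalized value.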
Q.99 Which of the following best describes 'data fusion' in the context of omics integration?
Combining multiple data sources at the raw or feature level to produce a unified representation for analysis
Physically mixing DNA, RNA, and protein samples in a tube
Running a single‑omics experiment multiple times
Measuring the same sample with the same technology
Explanation - Data fusion integrates heterogeneous datasets to leverage complementary information.
Correct answer is: Combining multiple data sources at the raw or feature level to produce a unified representation for analysis
Q.100 Which of the following is a major reason to use 'graph‑based regularization' when learning from multi‑omics networks?
It encourages smoothness of model parameters across neighboring nodes, reflecting biological connectivity
It speeds up the hardware clock speed
It removes the need for any data preprocessing
It converts all data to grayscale images
Explanation - Graph regularization leverages known interaction networks to bias learning towards biologically plausible solutions.
Correct answer is: It encourages smoothness of model parameters across neighboring nodes, reflecting biological connectivity
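Illustration - A minimal sketch of a graph smoothness penalty, assuming an unweighted network so that the penalty equals the sum of squared coefficient differences across edges (the Laplacian quadratic form). Node names and coefficient values are hypothetical.

```python
def graph_smoothness_penalty(coefficients, edges):
    """Sum of squared differences between coefficients of connected nodes."""
    return sum((coefficients[a] - coefficients[b]) ** 2 for a, b in edges)


# Hypothetical three-node chain spanning three omics layers.
edges = [("geneA", "proteinA"), ("proteinA", "metaboliteA")]

# Neighbouring nodes with similar coefficients incur a small penalty...
smooth = {"geneA": 0.8, "proteinA": 0.7, "metaboliteA": 0.75}
# ...while a sign flip on one connected node is penalised heavily.
rough = {"geneA": 0.8, "proteinA": -0.6, "metaboliteA": 0.75}

penalty_smooth = graph_smoothness_penalty(smooth, edges)
penalty_rough = graph_smoothness_penalty(rough, edges)
```

Adding this term to a loss function, scaled by a tuning parameter, pulls the coefficients of interacting molecules toward each other.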
Q.101 Which of the following statements about 'omics data dimensionality' is correct?
Omics datasets often contain far more features (genes, proteins, metabolites) than samples, leading to a high‑dimensional, low‑sample‑size problem
Omics data always have more samples than features
Dimensionality is not a concern for any statistical method
High dimensionality guarantees better model performance
Explanation - This imbalance necessitates dimensionality reduction or regularization to avoid over‑fitting.
Correct answer is: Omics datasets often contain far more features (genes, proteins, metabolites) than samples, leading to a high‑dimensional, low‑sample‑size problem
Q.102 In an integrated omics study, the term 'phenotype' most commonly refers to:
A measurable trait such as disease status, drug response, or physiological condition
The name of the laboratory
The brand of the sequencing platform
The color of the lab coat
Explanation - Phenotypes are the outcomes that omics data aim to explain or predict.
Correct answer is: A measurable trait such as disease status, drug response, or physiological condition
Q.103 Which of the following is a typical way to evaluate whether a multi‑omics integration has improved prediction of a clinical outcome?
Compare AUROC, accuracy, or other performance metrics of models built with single‑omics versus multi‑omics inputs
Count the number of files in the project folder
Measure the voltage of the computer used
Check the spelling of gene symbols
Explanation - Performance comparison quantifies the added value of integrating multiple data types.
Correct answer is: Compare AUROC, accuracy, or other performance metrics of models built with single‑omics versus multi‑omics inputs
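Illustration - A minimal sketch of an AUROC comparison, computing AUROC as the fraction of (case, control) pairs in which the case receives the higher score. The labels and scores are made‑up toy values, assumed to be held‑out predictions; they are not results from any study.

```python
def auroc(labels, scores):
    """AUROC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy held-out predictions for six hypothetical patients (1 = case, 0 = control).
labels = [1, 1, 1, 0, 0, 0]
scores_single_omics = [0.9, 0.4, 0.6, 0.5, 0.3, 0.7]
scores_multi_omics = [0.9, 0.8, 0.45, 0.5, 0.3, 0.4]

auc_single = auroc(labels, scores_single_omics)
auc_multi = auroc(labels, scores_multi_omics)
improvement = auc_multi - auc_single
```

A fair comparison uses the same samples, the same cross‑validation folds, and the same model family for both inputs, so that any difference can be attributed to the added omics layers.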
Q.104 Which of the following best describes a 'sparse' representation in multi‑omics data analysis?
A matrix where many entries are zero because only a subset of features are selected as informative
A matrix filled entirely with random numbers
A full dense matrix with no zeros
A picture of a sparsely populated city
Explanation - Sparsity aids interpretability and reduces over‑fitting by focusing on a limited number of key biomarkers.
Correct answer is: A matrix where many entries are zero because only a subset of features are selected as informative
Q.105 Which of the following is a commonly used resource for mapping human metabolites to their corresponding enzymes?
HMDB (Human Metabolome Database)
GenBank
PDB
Ensembl
Explanation - HMDB provides detailed metabolite information, including enzyme links and pathway context.
Correct answer is: HMDB (Human Metabolome Database)
Q.106 When integrating proteomics data, why is 'label‑free quantification' sometimes preferred over label‑based methods?
It avoids the cost and complexity of introducing stable isotopes, allowing analysis of a larger number of samples
It always provides higher accuracy than any label‑based method
It does not require a mass spectrometer
It converts proteins into DNA
Explanation - Label‑free approaches rely on peptide intensity or spectral counting, making them scalable for large cohorts.
Correct answer is: It avoids the cost and complexity of introducing stable isotopes, allowing analysis of a larger number of samples
Q.107 Which of the following is a key principle behind the 'central dogma' that underlies many omics studies?
DNA → RNA → Protein → Metabolite
Protein → DNA → RNA → Metabolite
Metabolite → Protein → DNA → RNA
RNA → DNA → Protein → Metabolite
Explanation - Strictly, the central dogma describes the flow of genetic information from DNA to RNA to protein; omics studies extend this chain to metabolites, which are produced and consumed through protein (enzyme) activity.
Correct answer is: DNA → RNA → Protein → Metabolite
Q.108 In a multi‑omics network, an edge annotated with a 'confidence score' of 0.9 most likely indicates:
High reliability of the interaction based on experimental evidence
That the edge is present in 90 % of all species
That the edge has a voltage of 0.9 V
That the edge connects exactly nine nodes
Explanation - Confidence scores quantify the strength or reliability of reported interactions.
Correct answer is: High reliability of the interaction based on experimental evidence
Q.109 Which of the following is a primary goal of 'systems pharmacology' that utilizes multi‑omics integration?
To predict drug response and identify therapeutic targets by integrating genomic, transcriptomic, proteomic, and metabolomic data
To design a new type of electrical resistor
To measure the temperature of a reaction
To count the number of cells under a microscope
Explanation - Systems pharmacology leverages integrated omics to understand drug mechanisms and personalize therapy.
Correct answer is: To predict drug response and identify therapeutic targets by integrating genomic, transcriptomic, proteomic, and metabolomic data
Q.110 Which of the following is a typical output format for a multi‑omics network that can be imported into Cytoscape?
XGMML (eXtensible Graph Markup and Modeling Language)
FASTQ
BED
VCF
Explanation - XGMML is an XML‑based format recognized by Cytoscape for graph data import.
Correct answer is: XGMML (eXtensible Graph Markup and Modeling Language)
Q.111 Which of the following statistical techniques can be used to test whether a set of genes is over‑represented in a list of differentially expressed genes obtained from multi‑omics integration?
Hypergeometric test
Fourier analysis
Monte Carlo simulation of electrical circuits
Linear regression of temperature over time
Explanation - The hypergeometric test assesses enrichment of a gene set relative to a background population.
Correct answer is: Hypergeometric test
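Illustration - A minimal sketch of a one‑sided hypergeometric enrichment test using only the standard library. The counts are made up, and the function and argument names are illustrative.

```python
from math import comb


def hypergeom_enrichment_p(overlap, population, set_size, list_size):
    """P(X >= overlap) when list_size genes are drawn from population genes,
    of which set_size belong to the gene set of interest."""
    upper = min(set_size, list_size)
    total = comb(population, list_size)
    tail = sum(
        comb(set_size, i) * comb(population - set_size, list_size - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


# Made-up counts: 20 background genes, a pathway of 5 genes,
# 5 differentially expressed genes, 3 of which are in the pathway.
p_value = hypergeom_enrichment_p(overlap=3, population=20, set_size=5, list_size=5)
```

The choice of background population matters: it should contain only the genes that could have been detected in the experiment, not the whole genome by default.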
Q.112 Which of the following best describes the concept of 'horizontal integration' in omics data analysis?
Combining the same type of omics data (e.g., transcriptomics) from multiple studies or cohorts
Merging different omics types from the same study
Integrating electrical circuit diagrams with biological pathways
Measuring the same sample repeatedly with the same instrument
Explanation - Horizontal integration expands sample size and statistical power by pooling comparable datasets.
Correct answer is: Combining the same type of omics data (e.g., transcriptomics) from multiple studies or cohorts
Q.113 Which of the following is an example of a 'latent variable model' used for multi‑omics data?
MOFA (Multi‑omics Factor Analysis)
PCR amplification
Western blot
Sanger sequencing
Explanation - MOFA models hidden factors that explain variation across multiple omics layers.
Correct answer is: MOFA (Multi‑omics Factor Analysis)
Q.114 When visualizing a multi‑omics network, what does the node size commonly encode?
The magnitude of a variable (e.g., expression level, abundance) associated with that node
The physical size of the lab bench
The voltage of an electrical component
The number of pages in a protocol
Explanation - Scaling node size by a quantitative attribute helps convey biologically relevant information.
Correct answer is: The magnitude of a variable (e.g., expression level, abundance) associated with that node
Q.115 Which of the following is a common strategy for handling missing values in metabolomics before integration?
Impute with half of the minimum detected value (half‑min approach)
Replace with zero for all metabolites
Delete the entire sample
Ignore the missing entries during analysis
Explanation - Half‑min imputation assumes that missing values fall below the detection limit and replaces them with a small positive value: half of the lowest value observed for that metabolite.
Correct answer is: Impute with half of the minimum detected value (half‑min approach)
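Illustration - A minimal sketch of half‑min imputation for a single metabolite, with missing values represented as `None`. The intensities are made up and the function name is illustrative.

```python
def half_min_impute(values):
    """Replace None entries with half of the smallest observed value."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a metabolite with no observed values")
    fill = min(observed) / 2
    return [fill if v is None else v for v in values]


# Made-up intensities for one metabolite across five samples.
raw = [4.0, None, 2.0, 8.0, None]
imputed = half_min_impute(raw)
```

The fill value is computed per metabolite, since detection limits and intensity ranges differ between compounds.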
Q.116 Which of the following best illustrates a 'multi‑omics' data matrix for a set of patients?
A block matrix where each block corresponds to a different omics layer (genes, proteins, metabolites) aligned by patient rows
A single column of patient names
A picture of a microscope
A list of lab equipment
Explanation - The block structure maintains modality separation while keeping patient alignment across layers.
Correct answer is: A block matrix where each block corresponds to a different omics layer (genes, proteins, metabolites) aligned by patient rows
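Illustration - A minimal sketch of assembling a patient‑aligned block matrix from per‑layer tables, keeping only patients measured in every layer. Patient IDs, layer names, and values are hypothetical.

```python
def build_block_matrix(layers):
    """layers maps layer name -> {patient_id: [feature values]}.
    Returns the shared patient IDs and one concatenated row per patient."""
    # Keep only patients present in every omics layer.
    shared = sorted(set.intersection(*(set(table) for table in layers.values())))
    rows = []
    for patient in shared:
        row = []
        # Concatenate the blocks in a fixed layer order.
        for name in layers:
            row.extend(layers[name][patient])
        rows.append(row)
    return shared, rows


layers = {
    "transcriptomics": {"P1": [5.1, 2.3], "P2": [4.8, 2.9], "P3": [6.0, 1.1]},
    "proteomics": {"P1": [0.7], "P2": [0.9]},
    "metabolomics": {"P1": [12.0, 3.3], "P2": [10.5, 4.1], "P3": [9.8, 2.2]},
}
patients, matrix = build_block_matrix(layers)
```

Patient P3 is dropped here because the proteomics block is missing; methods that tolerate missing blocks would keep such patients instead.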
Q.117 Which of the following is a major benefit of integrating proteomics with transcriptomics for understanding gene regulation?
It enables assessment of post‑transcriptional regulation by comparing mRNA levels with protein abundances
It eliminates the need for DNA sequencing
It automatically predicts metabolic fluxes
It measures the pH of the culture medium
Explanation - Discrepancies between mRNA and protein levels reveal regulatory mechanisms such as translation efficiency or protein stability.
Correct answer is: It enables assessment of post‑transcriptional regulation by comparing mRNA levels with protein abundances
Q.118 Which of the following is a commonly used technique to visualize the relationship between two omics layers (e.g., gene expression vs. metabolite abundance) across samples?
Scatter plot with each point representing a sample, colored by phenotype
Bar chart of a single gene
Pie chart of metabolite classes
Line graph of instrument temperature
Explanation - Scatter plots display pairwise relationships and can reveal correlations or patterns linked to phenotypes.
Correct answer is: Scatter plot with each point representing a sample, colored by phenotype
Q.119 Which of the following best captures the purpose of the 'integration step' in a multi‑omics pipeline?
Combining processed data from multiple omics platforms into a single analytical framework
Sequencing the genome for the first time
Running a Western blot experiment
Measuring the humidity in the lab
Explanation - Integration synthesizes diverse molecular measurements to enable joint analysis.
Correct answer is: Combining processed data from multiple omics platforms into a single analytical framework
Q.120 When applying 'regularized canonical correlation analysis' (rCCA) to multi‑omics data, what does the regularization term help to achieve?
It stabilizes the solution when the number of features exceeds the number of samples, preventing over‑fitting
It converts all data to binary format
It removes the need for any preprocessing
It guarantees perfect correlation between datasets
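Explanation - Regularization adds a ridge penalty to the within‑dataset covariance matrices, keeping them invertible and the canonical weights stable, which makes CCA feasible in the high‑dimensional settings typical of omics data.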
Correct answer is: It stabilizes the solution when the number of features exceeds the number of samples, preventing over‑fitting
Q.121 Which of the following is a typical challenge when integrating single‑cell RNA‑seq with bulk proteomics data?
Differing resolution: single‑cell data capture cell‑to‑cell variability, whereas bulk proteomics provides averaged measurements
Both datasets have identical data structures
The same number of features is always present in both
There is no need for any normalization
Explanation - Resolution mismatch requires careful modeling (e.g., deconvolution) to align the datasets.
Correct answer is: Differing resolution: single‑cell data capture cell‑to‑cell variability, whereas bulk proteomics provides averaged measurements
Q.122 Which of the following best explains why 'graph embeddings' are useful for multi‑omics data integration?
They convert nodes and edges of biological networks into low‑dimensional vectors that can be fed into machine‑learning models
They increase the file size of the dataset
They transform all data into images
They replace the need for any statistical analysis
Explanation - Graph embeddings preserve network topology while providing a compact numeric representation suitable for downstream tasks.
Correct answer is: They convert nodes and edges of biological networks into low‑dimensional vectors that can be fed into machine‑learning models
Q.123 Which of the following is a common approach for integrating DNA methylation and gene expression data to identify epigenetically regulated genes?
Correlation analysis between promoter methylation beta values and gene expression levels, followed by significance testing
Counting the number of nucleotides in the genome
Measuring the pH of the extraction buffer
Running a gel electrophoresis of DNA
Explanation - Negative correlations suggest methylation‑mediated repression of transcription.
Correct answer is: Correlation analysis between promoter methylation beta values and gene expression levels, followed by significance testing
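Illustration - A minimal sketch of the correlation step for one gene: Pearson correlation between promoter methylation beta values and expression, plus the t statistic used to test it against zero. The values are made up, and converting the t statistic to a p‑value (n − 2 degrees of freedom) is left to a statistics library.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


def correlation_t_statistic(r, n):
    """t statistic for testing r = 0, with n - 2 degrees of freedom."""
    return r * sqrt((n - 2) / (1 - r ** 2))


# Made-up values for one gene across five samples.
promoter_beta = [0.1, 0.3, 0.5, 0.7, 0.9]
expression = [9.0, 7.5, 6.2, 4.1, 2.0]

r = pearson_r(promoter_beta, expression)
t_stat = correlation_t_statistic(r, len(promoter_beta))
```

Repeating this for every gene produces thousands of tests, so the resulting p‑values would need multiple‑testing correction before genes are called epigenetically regulated.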
Q.124 Which of the following best describes the 'curse of dimensionality' in the context of omics data?
As the number of features grows, the volume of the feature space increases exponentially, making data sparse and statistical inference challenging
More dimensions always improve model performance
It refers to a hardware limitation of electrical circuits
It is a term used for low‑resolution images
Explanation - High dimensionality leads to over‑fitting and requires dimensionality reduction or regularization.
Correct answer is: As the number of features grows, the volume of the feature space increases exponentially, making data sparse and statistical inference challenging
Q.125 Which of the following is a standard step to ensure reproducibility of a multi‑omics integration workflow?
Documenting all software versions, parameters, and using workflow management systems (e.g., Snakemake, Nextflow)
Keeping all scripts on a personal notebook without backup
Changing variable names arbitrarily
Running the analysis only once
Explanation - Version control and workflow managers capture the exact steps, enabling others to reproduce the analysis.
Correct answer is: Documenting all software versions, parameters, and using workflow management systems (e.g., Snakemake, Nextflow)
Q.126 Which of the following techniques can be used to integrate spatial transcriptomics with metabolomics data from tissue sections?
Spatially resolved data fusion using joint matrix factorization aligned by spatial coordinates
Standard PCR amplification
Bulk RNA extraction only
Measuring temperature gradients across the tissue
Explanation - Joint factorization leverages the shared spatial layout to combine molecular layers at each tissue location.
Correct answer is: Spatially resolved data fusion using joint matrix factorization aligned by spatial coordinates
