# GPU and Multicore Architectures: MCQ Practice Set

Q.1 What is the primary purpose of a GPU in modern computing?

To perform general-purpose calculations
To handle graphics rendering and parallel processing
To manage input-output operations
To store large amounts of data
Explanation - GPUs are specialized for graphics rendering and highly parallelizable computations, unlike CPUs, which are optimized for sequential, latency-sensitive tasks.
Correct answer is: To handle graphics rendering and parallel processing

Q.2 Which of the following best describes SIMD in GPU architecture?

Single Instruction, Multiple Data
Simple Instruction, Multiple Data
Single Instruction, Multi-threaded Data
Sequential Instruction, Multi-core Data
Explanation - SIMD allows a single instruction to be executed on multiple data points simultaneously, which is a fundamental principle of GPU parallelism.
Correct answer is: Single Instruction, Multiple Data
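
The SIMD idea can be sketched conceptually in plain Python; this is only a model of the execution style, not how Python actually runs (a list element stands in for a SIMD lane):

```python
# SIMD idea: one operation ("add 5") applied to many data elements at once.
data = [1, 2, 3, 4]

# Scalar model: the instruction is issued once per element.
scalar = []
for x in data:
    scalar.append(x + 5)

# SIMD model: conceptually a single "vector add" across all lanes.
simd = [x + 5 for x in data]  # each list element plays the role of a lane

print(scalar == simd)  # same result; SIMD conceptually issues one instruction
```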

Q.3 What is the main difference between a CPU and a GPU?

CPUs have more cores than GPUs
GPUs are optimized for sequential processing
GPUs have thousands of simpler cores optimized for parallel processing
CPUs cannot execute floating-point operations
Explanation - GPUs contain many small cores to handle parallel workloads, whereas CPUs have fewer, more complex cores for general-purpose tasks.
Correct answer is: GPUs have thousands of simpler cores optimized for parallel processing

Q.4 Which memory type in GPU is the fastest and closest to the cores?

Global memory
Shared memory
Texture memory
Registers
Explanation - Registers are private to each thread and provide the fastest access, followed by shared memory, then global memory.
Correct answer is: Registers

Q.5 What does 'warp' refer to in NVIDIA GPU architecture?

A single thread
A group of threads executing the same instruction
A GPU core
A memory block
Explanation - In NVIDIA GPUs, a warp is a group of 32 threads that execute the same instruction in lockstep.
Correct answer is: A group of threads executing the same instruction
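
Because the warp size is fixed (32 on NVIDIA GPUs to date), a block's thread count is rounded up to whole warps. A quick sketch of the arithmetic:

```python
WARP_SIZE = 32  # fixed at 32 on NVIDIA GPUs to date

def warps_per_block(threads_per_block: int) -> int:
    """Number of warps allocated for a block (ceiling division)."""
    return (threads_per_block + WARP_SIZE - 1) // WARP_SIZE

# A block of 100 threads occupies 4 warps; the last warp runs with
# only 4 active lanes, so 28 lanes are wasted.
print(warps_per_block(100))  # 4
print(warps_per_block(256))  # 8
```

This is why block sizes are usually chosen as multiples of 32.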

Q.6 Which scheduling model is commonly used in multicore CPUs to achieve parallelism?

Round-robin
Out-of-order execution
Simultaneous multithreading (SMT)
All of the above
Explanation - Multicore CPUs combine round-robin thread scheduling, out-of-order instruction execution, and SMT to maximize parallelism and core utilization.
Correct answer is: All of the above

Q.7 Which of the following is a bottleneck in GPU performance?

Compute cores
Memory bandwidth
Instruction set
Registers
Explanation - GPUs often have high computational throughput but are limited by memory bandwidth when accessing large datasets.
Correct answer is: Memory bandwidth

Q.8 What is a key feature of multicore processor architecture?

Multiple independent execution units on a single chip
Single-threaded execution
Dedicated graphics processing only
No cache memory
Explanation - Multicore processors integrate multiple cores on a single chip, enabling parallel execution of tasks to improve performance.
Correct answer is: Multiple independent execution units on a single chip

Q.9 Which type of parallelism is mainly exploited by GPUs?

Instruction-level parallelism
Thread-level parallelism
Task-level parallelism
Data-level parallelism
Explanation - GPUs are highly optimized for executing the same operation on multiple data elements simultaneously (data-level parallelism).
Correct answer is: Data-level parallelism

Q.10 What is the main challenge in programming multicore processors?

Limited memory
Synchronization and data sharing between cores
Lack of compiler support
No instruction pipelining
Explanation - Programming multicore systems requires careful handling of data dependencies, synchronization, and avoiding race conditions.
Correct answer is: Synchronization and data sharing between cores
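
The synchronization problem can be sketched with a shared counter: without a lock, the read-modify-write from two threads can interleave and lose updates (a race condition). A minimal Python sketch of the locked version:

```python
import threading

counter = 0
lock = threading.Lock()

def add(n: int) -> None:
    """Increment the shared counter n times under a lock.

    Without the lock, `counter += 1` (a read, an add, a write) from
    concurrent threads can interleave, losing increments."""
    global counter
    for _ in range(n):
        with lock:
            counter += 1

threads = [threading.Thread(target=add, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 40000 -- deterministic only because of the lock
```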

Q.11 Which API is specifically designed for general-purpose GPU computing?

OpenGL
CUDA
DirectX
Vulkan
Explanation - CUDA is NVIDIA’s parallel computing platform and programming model for general-purpose GPU computing; OpenGL, DirectX, and Vulkan are primarily graphics APIs.
Correct answer is: CUDA

Q.12 In a multicore processor, what is cache coherence?

Keeping all caches identical
Ensuring all cores see a consistent view of memory
Preventing cache overflows
Sharing registers among cores
Explanation - Cache coherence protocols ensure that updates to memory in one core are visible to other cores, maintaining consistency.
Correct answer is: Ensuring all cores see a consistent view of memory

Q.13 Which GPU memory is shared among threads in a block?

Global memory
Texture memory
Shared memory
Local memory
Explanation - Shared memory is a fast, on-chip memory accessible to all threads within a block, allowing efficient communication.
Correct answer is: Shared memory

Q.14 What is the purpose of the ALU in GPU cores?

Memory storage
Arithmetic and logical computations
Thread scheduling
Cache management
Explanation - The ALU (Arithmetic Logic Unit) performs mathematical and logical operations on data processed by GPU cores.
Correct answer is: Arithmetic and logical computations

Q.15 What does 'heterogeneous computing' refer to?

Using multiple CPUs only
Using CPUs and GPUs together to perform computation
Using multiple GPUs only
Using CPUs sequentially
Explanation - Heterogeneous computing combines the strengths of CPUs (general-purpose) and GPUs (parallel computation) for improved performance.
Correct answer is: Using CPUs and GPUs together to perform computation

Q.16 What is Amdahl’s Law used for?

Estimating power consumption
Predicting the speedup of parallel systems
Calculating GPU memory bandwidth
Designing instruction sets
Explanation - Amdahl’s Law estimates the maximum expected improvement in performance when only part of a system is parallelized.
Correct answer is: Predicting the speedup of parallel systems
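
Amdahl's Law states S = 1 / ((1 - p) + p / n), where p is the parallelizable fraction of the work and n the number of cores. A small sketch of the formula:

```python
def amdahl_speedup(p: float, n: int) -> float:
    """Amdahl's Law: S = 1 / ((1 - p) + p / n),
    where p is the parallelizable fraction and n the number of cores."""
    return 1.0 / ((1.0 - p) + p / n)

# Even with effectively unlimited cores, a 90%-parallel program
# tops out near 10x, because the serial 10% dominates.
print(round(amdahl_speedup(0.9, 8), 2))      # 4.71
print(round(amdahl_speedup(0.9, 10**9), 2))  # 10.0
```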

Q.17 Which instruction execution model is common in GPUs?

Out-of-order execution
In-order SIMD execution
Speculative execution
Superscalar execution
Explanation - GPUs typically use in-order execution within SIMD units, allowing multiple threads to process data in parallel efficiently.
Correct answer is: In-order SIMD execution

Q.18 Which of the following improves GPU throughput?

Increasing the number of ALUs
Reducing memory latency
Using multiple warps
All of the above
Explanation - GPU throughput can be increased by adding more ALUs, reducing memory latency, and keeping many warps in flight so the scheduler can hide stalls.
Correct answer is: All of the above

Q.19 What is a thread block in CUDA programming?

A single GPU core
A group of threads executed together on a multiprocessor
A memory segment
A cache line
Explanation - A thread block is a group of threads scheduled together on a single streaming multiprocessor; threads within a block can share memory and synchronize.
Correct answer is: A group of threads executed together on a multiprocessor
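
Launching a CUDA kernel over N elements typically means computing how many blocks cover N, and each thread deriving its global index from its block and thread indices. The arithmetic, mirrored in plain Python (function names are illustrative):

```python
def grid_size(n: int, threads_per_block: int) -> int:
    """Ceiling division: enough blocks to cover n elements."""
    return (n + threads_per_block - 1) // threads_per_block

def global_index(block_idx: int, block_dim: int, thread_idx: int) -> int:
    """Mirrors CUDA's blockIdx.x * blockDim.x + threadIdx.x."""
    return block_idx * block_dim + thread_idx

n, tpb = 1000, 256
print(grid_size(n, tpb))          # 4 blocks; the last block has idle threads
print(global_index(3, 256, 231))  # 999, the last valid element
```

In the kernel itself, threads whose global index is >= n simply return, which is why the ceiling division is safe.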

Q.20 Which type of parallelism is more limited in CPUs than GPUs?

Thread-level parallelism
Instruction-level parallelism
Data-level parallelism
Task-level parallelism
Explanation - CPUs excel in instruction-level and task-level parallelism but have fewer cores, limiting massive data-level parallelism compared to GPUs.
Correct answer is: Data-level parallelism

Q.21 Which factor is critical when scaling multicore systems?

Cache coherence
Memory latency
Interconnect bandwidth
All of the above
Explanation - Scaling multicore systems requires addressing cache coherence, reducing memory latency, and providing sufficient interconnect bandwidth.
Correct answer is: All of the above

Q.22 Which GPU architecture component schedules warps?

SM (Streaming Multiprocessor)
ALU
Register file
Texture unit
Explanation - The Streaming Multiprocessor (SM) schedules and manages warps, executing instructions on available cores efficiently.
Correct answer is: SM (Streaming Multiprocessor)

Q.23 What is the benefit of hyper-threading in multicore CPUs?

It reduces memory usage
It allows multiple threads per core for better utilization
It increases core count physically
It improves GPU performance
Explanation - Hyper-threading enables a single CPU core to execute multiple threads, improving utilization and throughput without increasing core count.
Correct answer is: It allows multiple threads per core for better utilization

Q.24 Which of the following best describes 'coalesced memory access' in GPUs?

Each thread accesses a random memory location
Memory accesses from threads in a warp are combined for efficiency
Memory is accessed sequentially by the CPU
Registers are shared between threads
Explanation - Coalesced memory access reduces latency and increases bandwidth by combining accesses from multiple threads into a single memory transaction.
Correct answer is: Memory accesses from threads in a warp are combined for efficiency
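
The effect of coalescing can be sketched by counting how many memory segments a warp's 32 accesses touch; the 128-byte segment size is an assumption typical of NVIDIA hardware, used here only for illustration:

```python
SEGMENT_BYTES = 128  # assumed memory transaction granularity
WARP_SIZE = 32
FLOAT_BYTES = 4

def segments_touched(addresses):
    """Count distinct 128-byte segments the warp's accesses fall into."""
    return len({addr // SEGMENT_BYTES for addr in addresses})

# Coalesced: consecutive lanes read consecutive floats -> one transaction.
coalesced = [lane * FLOAT_BYTES for lane in range(WARP_SIZE)]
# Strided: each lane jumps 32 floats -> one transaction per lane.
strided = [lane * 32 * FLOAT_BYTES for lane in range(WARP_SIZE)]

print(segments_touched(coalesced))  # 1
print(segments_touched(strided))    # 32
```

One transaction versus thirty-two for the same amount of useful data is why access patterns matter so much for GPU bandwidth.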