Computer Vision: MCQs Practice Set

Q.1 What is the primary goal of computer vision?

To simulate human thinking
To enable machines to understand visual data
To improve text processing
To generate random images
Explanation - Computer vision focuses on enabling machines to interpret and understand visual data like images and videos.
Correct answer is: To enable machines to understand visual data

Q.2 Which of the following is a typical application of computer vision?

Spam email filtering
Face recognition
Text translation
Audio synthesis
Explanation - Face recognition is a direct application of computer vision where systems identify or verify a person’s face in images or videos.
Correct answer is: Face recognition

Q.3 Which type of data is mainly used in computer vision?

Audio signals
Image and video data
Numerical tabular data
Text data
Explanation - Computer vision specifically deals with image and video data, unlike NLP or speech processing domains.
Correct answer is: Image and video data

Q.4 Which algorithm is most commonly used in modern computer vision tasks?

Decision Trees
Support Vector Machines
Convolutional Neural Networks
K-Means Clustering
Explanation - CNNs are widely used in computer vision because they are effective at extracting spatial features from images.
Correct answer is: Convolutional Neural Networks

Q.5 Edge detection in images is primarily used to:

Remove noise
Highlight object boundaries
Convert color images to grayscale
Enhance image brightness
Explanation - Edge detection identifies sudden changes in pixel intensity, which usually correspond to object boundaries.
Correct answer is: Highlight object boundaries
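
As a minimal sketch of the idea in the explanation, the snippet below looks for the largest jump between neighbouring pixels in a single row. The intensity values are made up for illustration, and real edge detectors work on 2-D images with smoothing and thresholds.

```python
# Hypothetical row of pixel intensities: a dark region followed by a bright one.
row = [10, 10, 10, 200, 200, 200]

# Absolute difference between neighbouring pixels; a large value marks an edge.
diffs = [abs(b - a) for a, b in zip(row, row[1:])]

# Position of the strongest change: the boundary lies between pixels 2 and 3.
edge_index = diffs.index(max(diffs))
print(diffs, edge_index)  # [0, 0, 190, 0, 0] 2
```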

Q.6 Which technique reduces image size while retaining key information?

Upsampling
Pooling
Normalization
Augmentation
Explanation - Pooling layers in CNNs reduce the spatial dimensions of feature maps (for example, halving height and width), keeping the strongest responses while lowering computational cost.
Correct answer is: Pooling
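
To make the size reduction concrete, here is a small sketch of 2x2 max pooling with stride 2 on plain Python lists. The function name `max_pool_2x2` and the sample values are illustrative assumptions, and the input is assumed to have even height and width.

```python
def max_pool_2x2(feature_map):
    """2x2 max pooling with stride 2 on a 2-D list with even height and width."""
    pooled = []
    for r in range(0, len(feature_map), 2):
        row = []
        for c in range(0, len(feature_map[0]), 2):
            # Collect the four values of the current 2x2 window.
            window = (feature_map[r][c], feature_map[r][c + 1],
                      feature_map[r + 1][c], feature_map[r + 1][c + 1])
            row.append(max(window))
        pooled.append(row)
    return pooled

fmap = [[1, 3, 2, 0],
        [4, 2, 1, 5],
        [0, 1, 7, 2],
        [3, 2, 4, 6]]
pooled = max_pool_2x2(fmap)
print(pooled)  # [[4, 5], [3, 7]]
```

The 4x4 input becomes a 2x2 output, and each output value is the largest value of its window.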

Q.7 Which dataset is widely used for handwritten digit recognition?

ImageNet
CIFAR-10
MNIST
COCO
Explanation - The MNIST dataset is a benchmark of 70,000 grayscale images (28x28 pixels) of handwritten digits 0 to 9, commonly used for training and evaluating image classification models.
Correct answer is: MNIST

Q.8 Image classification refers to:

Assigning a label to an entire image
Finding the location of objects in an image
Separating background from objects
Restoring corrupted images
Explanation - Image classification tasks involve predicting the category or label of the entire image, unlike detection or segmentation.
Correct answer is: Assigning a label to an entire image

Q.9 Object detection differs from image classification by:

Detecting objects without labels
Providing both object labels and their locations
Working only on grayscale images
Focusing only on image enhancement
Explanation - Unlike classification, object detection finds objects in an image and provides bounding boxes along with labels.
Correct answer is: Providing both object labels and their locations

Q.10 Which of these is NOT an application of computer vision?

Autonomous driving
Medical image analysis
Text summarization
Security surveillance
Explanation - Text summarization is an NLP task, not computer vision. Computer vision focuses on analyzing visual data.
Correct answer is: Text summarization

Q.11 Which color model is most commonly used in computer vision?

CMYK
RGB
HSV
XYZ
Explanation - The RGB color model is the most widely used for representing digital images in computer vision.
Correct answer is: RGB
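
For illustration only, an RGB image can be pictured as a grid in which every pixel holds three 8-bit values. The tiny 2x2 image below is hypothetical, and nested lists stand in for the array types that image libraries normally use.

```python
# Hypothetical 2x2 image: each pixel is an (R, G, B) tuple of 8-bit values.
image = [
    [(255, 0, 0), (0, 255, 0)],      # red, green
    [(0, 0, 255), (255, 255, 255)],  # blue, white
]
height, width = len(image), len(image[0])

# Pull out a single channel as its own 2-D grid.
red_channel = [[pixel[0] for pixel in row] for row in image]
print(height, width, red_channel)  # 2 2 [[255, 0], [0, 255]]
```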

Q.12 Which type of neural network is best suited for sequential image data like videos?

Convolutional Neural Networks
Recurrent Neural Networks
Autoencoders
Generative Adversarial Networks
Explanation - RNNs (often combined with CNNs that extract per-frame features) are suited to sequential data such as video, where dependencies between frames matter.
Correct answer is: Recurrent Neural Networks

Q.13 Which preprocessing step helps improve model performance in image classification?

Image normalization
Word tokenization
Feature hashing
Stopword removal
Explanation - Normalization scales pixel values to a common range (for example [0, 1], or zero mean and unit variance), which typically improves convergence when training vision models.
Correct answer is: Image normalization
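
Two common forms of normalization can be sketched as follows, assuming 8-bit pixel values in the range 0 to 255. The function names and the `eps` guard against division by zero are illustrative choices, not a fixed recipe.

```python
def scale_to_unit_range(pixels):
    """Map 8-bit pixel values from [0, 255] to floats in [0, 1]."""
    return [p / 255.0 for p in pixels]

def standardize(values, eps=1e-8):
    """Shift to zero mean and scale to (approximately) unit variance."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    std = variance ** 0.5
    return [(v - mean) / (std + eps) for v in values]

pixels = [0, 51, 102, 255]          # hypothetical flattened image
scaled = scale_to_unit_range(pixels)
standardized = standardize(scaled)
print(scaled)
```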

Q.14 Which type of problem does semantic segmentation solve?

Image-level labeling
Pixel-level labeling
Video frame prediction
Noise removal
Explanation - Semantic segmentation assigns a class label to every pixel, so object regions are outlined at pixel precision rather than by a single image-level label or a box.
Correct answer is: Pixel-level labeling
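
One way to picture pixel-level labeling: a segmentation model produces a score per class for every pixel, and the label map takes the highest-scoring class at each position. The scores and class names below are invented for the sketch.

```python
# Hypothetical per-pixel class scores, indexed as scores[class][y][x].
scores = [
    [[0.9, 0.8], [0.2, 0.1]],  # class 0, e.g. "background"
    [[0.1, 0.2], [0.8, 0.9]],  # class 1, e.g. "object"
]
height, width = len(scores[0]), len(scores[0][0])

def argmax(values):
    """Index of the largest value in a list."""
    return max(range(len(values)), key=lambda i: values[i])

# For every pixel, pick the class with the highest score.
label_map = [
    [argmax([scores[c][y][x] for c in range(len(scores))]) for x in range(width)]
    for y in range(height)
]
print(label_map)  # [[0, 0], [1, 1]]
```

The output has the same height and width as the input, with one class index per pixel.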

Q.15 What does 'ImageNet' provide for computer vision research?

Audio samples
Large-scale labeled images
Text corpus
Speech datasets
Explanation - ImageNet is a large dataset of labeled images that is widely used for benchmarking computer vision models.
Correct answer is: Large-scale labeled images

Q.16 What is the role of convolution in CNNs?

To reduce dataset size
To extract features from images
To randomize pixels
To increase resolution
Explanation - Convolution operations apply filters that detect local patterns like edges, textures, or shapes in images.
Correct answer is: To extract features from images
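
The filtering step described above can be sketched with a small sliding-window function. It computes cross-correlation without padding, which is what deep learning libraries usually call convolution. The function name, image, and kernel are assumptions chosen for illustration.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding) and sum elementwise products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0
            for a in range(kh):
                for b in range(kw):
                    total += image[i + a][j + b] * kernel[a][b]
            row.append(total)
        output.append(row)
    return output

image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1]]
kernel = [[-1, 1]]  # responds where intensity increases from left to right
response = conv2d_valid(image, kernel)
print(response)  # [[0, 1, 0], [0, 1, 0], [0, 1, 0]]
```

The filter responds only in the column where the dark-to-bright transition occurs, which is the local pattern it was built to detect.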

Q.17 Which technique helps prevent overfitting in CNNs?

Dropout
Pooling
Batch normalization
Clustering
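Explanation - Dropout randomly deactivates a fraction of neurons during training, so the network cannot rely on any single unit, which reduces overfitting. Batch normalization can have a mild regularizing effect, but its main purpose is to stabilize training.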
Correct answer is: Dropout
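
A minimal sketch of the mechanism, assuming the common "inverted dropout" variant in which surviving activations are rescaled during training so that nothing needs to change at inference time. The function signature is illustrative.

```python
import random

def dropout(activations, drop_prob, rng, training=True):
    """Inverted dropout: zero each unit with probability drop_prob, rescale the rest."""
    if not training or drop_prob == 0.0:
        return list(activations)  # inference: pass values through unchanged
    keep_prob = 1.0 - drop_prob
    return [a / keep_prob if rng.random() < keep_prob else 0.0
            for a in activations]

rng = random.Random(0)          # seeded for repeatability
activations = [1.0] * 10
train_out = dropout(activations, 0.5, rng)
eval_out = dropout(activations, 0.5, rng, training=False)
print(train_out)
print(eval_out)
```

With `drop_prob=0.5`, each training output is either 0.0 (dropped) or 2.0 (kept and rescaled), while the evaluation output equals the input.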

Q.18 Which of the following is an image augmentation technique?

Rotation
Stopword removal
Tokenization
Normalization of text
Explanation - Rotation is an image augmentation method used to artificially increase dataset size and variability.
Correct answer is: Rotation
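
As a small illustration, the sketch below rotates an image in 90-degree steps to produce extra training variants. The helper name is hypothetical, and augmentation pipelines in practice often use arbitrary angles with interpolation.

```python
def rotate_90_clockwise(image):
    """Rotate a 2-D list of pixels by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]

image = [[1, 2],
         [3, 4]]
rotated = rotate_90_clockwise(image)
print(rotated)  # [[3, 1], [4, 2]]

# Original plus its 90, 180, and 270 degree rotations.
augmented = [image]
for _ in range(3):
    augmented.append(rotate_90_clockwise(augmented[-1]))
print(len(augmented))  # 4
```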

Q.19 Optical Character Recognition (OCR) is used to:

Detect objects in images
Convert text in images into editable text
Translate spoken words
Detect emotions in faces
Explanation - OCR extracts text from images and converts it into machine-readable characters that can be edited and searched.
Correct answer is: Convert text in images into editable text

Q.20 Which filter highlights edges in an image?

Gaussian blur
Sobel filter
Mean filter
Median filter
Explanation - The Sobel filter emphasizes edges by approximating the horizontal and vertical intensity gradients of an image. Gaussian, mean, and median filters smooth the image instead.
Correct answer is: Sobel filter
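
The standard 3x3 Sobel kernels can be applied to a single patch to show the effect. The patches below are hypothetical, and a full implementation would slide the kernels over every position of the image.

```python
SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1],
           [0, 0, 0],
           [1, 2, 1]]

def sobel_magnitude(patch):
    """Gradient magnitude at the centre of a 3x3 patch."""
    gx = sum(patch[r][c] * SOBEL_X[r][c] for r in range(3) for c in range(3))
    gy = sum(patch[r][c] * SOBEL_Y[r][c] for r in range(3) for c in range(3))
    return (gx ** 2 + gy ** 2) ** 0.5

vertical_edge = [[0, 0, 255],
                 [0, 0, 255],
                 [0, 0, 255]]
flat = [[128, 128, 128],
        [128, 128, 128],
        [128, 128, 128]]

edge_strength = sobel_magnitude(vertical_edge)
flat_strength = sobel_magnitude(flat)
print(edge_strength, flat_strength)  # 1020.0 0.0
```

A uniform patch gives zero response, while the patch containing a dark-to-bright transition gives a strong one.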

Q.21 Which of these architectures is most commonly used for real-time object detection?

YOLO
RNN
KNN
Naive Bayes
Explanation - YOLO (You Only Look Once) is a popular CNN-based architecture for real-time object detection.
Correct answer is: YOLO

Q.22 What does a bounding box represent in object detection?

Region containing an object
Background area of image
Noise removal filter
Color histogram
Explanation - Bounding boxes are rectangular regions that specify the position of detected objects in images.
Correct answer is: Region containing an object
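
A bounding box is often stored as corner coordinates. The sketch below derives one from a binary object mask, assuming the (x_min, y_min, x_max, y_max) convention with inclusive pixel indices. Other conventions, such as (x, y, width, height), are also in use, and the function name is illustrative.

```python
def mask_to_bbox(mask):
    """Return (x_min, y_min, x_max, y_max) of the nonzero pixels.

    Assumes the mask contains at least one nonzero pixel.
    """
    coords = [(x, y)
              for y, row in enumerate(mask)
              for x, value in enumerate(row) if value]
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return min(xs), min(ys), max(xs), max(ys)

# Hypothetical 5x6 mask with an object covering rows 1-3 and columns 2-4.
mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
bbox = mask_to_bbox(mask)
print(bbox)  # (2, 1, 4, 3)
```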

Q.23 In CNNs, 'stride' refers to:

The step size of the filter movement
The number of hidden layers
The dropout probability
The image resolution
Explanation - Stride determines how many pixels the filter moves at each step during convolution; a larger stride produces a smaller output.
Correct answer is: The step size of the filter movement
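
The effect of stride on output size follows the usual formula floor((n + 2p - k) / s) + 1 for input size n, kernel size k, padding p, and stride s. The helper below is a sketch of that formula for one spatial dimension, and its name is an assumption.

```python
def conv_output_size(input_size, kernel_size, stride=1, padding=0):
    """Output length along one dimension of a convolution."""
    return (input_size + 2 * padding - kernel_size) // stride + 1

print(conv_output_size(7, 3, stride=1))              # 5
print(conv_output_size(7, 3, stride=2))              # 3
print(conv_output_size(32, 3, stride=1, padding=1))  # 32
```

Doubling the stride from 1 to 2 shrinks the output of a 7-pixel input from 5 positions to 3.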

Q.24 Which of the following is a challenge in computer vision?

Variations in lighting and pose
High availability of labeled data
Easy interpretability of models
Unlimited computational resources
Explanation - Changes in lighting, scale, and pose make it difficult for vision systems to recognize objects consistently.
Correct answer is: Variations in lighting and pose

Q.25 Which deep learning technique is used for generating realistic images?

GANs
RNNs
Autoencoders
SVMs
Explanation - Generative Adversarial Networks (GANs) generate realistic images by training two competing networks: a generator that produces images and a discriminator that tries to tell them apart from real ones.
Correct answer is: GANs