Computer Vision MCQs Practice Set

Q.1 What is the primary goal of computer vision?

To simulate human thinking

To enable machines to understand visual data

To improve text processing

To generate random images

Explanation - Computer vision focuses on enabling machines to interpret and understand visual data like images and videos.

Correct answer is: To enable machines to understand visual data

Q.2 Which of the following is a typical application of computer vision?

Spam email filtering

Face recognition

Text translation

Audio synthesis

Explanation - Face recognition is a direct application of computer vision where systems identify or verify a person’s face in images or videos.

Correct answer is: Face recognition

Q.3 Which type of data is mainly used in computer vision?

Audio signals

Image and video data

Numerical tabular data

Text data

Explanation - Computer vision specifically deals with image and video data, unlike NLP or speech processing domains.

Correct answer is: Image and video data

Q.4 Which type of model is most commonly used in modern computer vision tasks?

Decision Trees

Support Vector Machines

Convolutional Neural Networks

K-Means Clustering

Explanation - CNNs are widely used in computer vision because they are effective at extracting spatial features from images.

Correct answer is: Convolutional Neural Networks

Q.5 Edge detection in images is primarily used to:

Remove noise

Highlight object boundaries

Convert color images to grayscale

Enhance image brightness

Explanation - Edge detection identifies sudden changes in pixel intensity, which usually correspond to object boundaries.

Correct answer is: Highlight object boundaries
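
As a rough illustration of "sudden changes in pixel intensity", the sketch below scans a single row of grayscale values and reports where neighbouring pixels differ by more than a threshold. The function name edge_positions, the sample row, and the threshold of 50 are all assumptions made for this example; real edge detectors work in two dimensions and usually smooth the image first.

```python
def edge_positions(row, threshold):
    """Return each index i where |row[i + 1] - row[i]| exceeds the threshold."""
    return [
        i
        for i in range(len(row) - 1)
        if abs(row[i + 1] - row[i]) > threshold
    ]


# Hypothetical row of grayscale pixels: dark, then bright, then dark again.
row = [10, 12, 11, 200, 198, 201, 15, 14]
print(edge_positions(row, threshold=50))  # [2, 5]
```

The two reported positions are where the row jumps from dark to bright and back, which is where an object boundary would lie in this toy example.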

Q.6 Which technique reduces image size while retaining key information?

Upsampling

Pooling

Normalization

Augmentation

Explanation - Pooling layers in CNNs reduce the spatial dimensions of feature maps, keeping the strongest responses while lowering computational cost.

Correct answer is: Pooling
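
A minimal sketch of one common pooling variant, 2x2 max pooling with stride 2, may help here. The function name max_pool_2x2 and the sample values are assumptions for illustration, and the code assumes the input has even height and width.

```python
def max_pool_2x2(feature_map):
    """Apply 2x2 max pooling with stride 2 to a list-of-lists feature map.

    Assumes the height and width are both even.
    """
    height, width = len(feature_map), len(feature_map[0])
    return [
        [
            max(
                feature_map[r][c], feature_map[r][c + 1],
                feature_map[r + 1][c], feature_map[r + 1][c + 1],
            )
            for c in range(0, width, 2)
        ]
        for r in range(0, height, 2)
    ]


# Hypothetical 4x4 feature map.
feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 1, 5, 6],
    [2, 2, 7, 8],
]
print(max_pool_2x2(feature_map))  # [[4, 2], [2, 8]]
```

The 4x4 input shrinks to 2x2, and each output value is the largest response in its 2x2 window.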

Q.7 Which dataset is widely used for handwritten digit recognition?

ImageNet

CIFAR-10

MNIST

COCO

Explanation - MNIST is a benchmark dataset of handwritten digits (0 to 9), commonly used for training and evaluating image classification models.

Correct answer is: MNIST

Q.8 Image classification refers to:

Assigning a label to an entire image

Finding the location of objects in an image

Separating background from objects

Restoring corrupted images

Explanation - Image classification tasks involve predicting the category or label of the entire image, unlike detection or segmentation.

Correct answer is: Assigning a label to an entire image

Q.9 Object detection differs from image classification by:

Detecting objects without labels

Providing both object labels and their locations

Working only on grayscale images

Focusing only on image enhancement

Explanation - Unlike classification, object detection finds objects in an image and provides bounding boxes along with labels.

Correct answer is: Providing both object labels and their locations

Q.10 Which of these is NOT an application of computer vision?

Autonomous driving

Medical image analysis

Text summarization

Security surveillance

Explanation - Text summarization is an NLP task, not computer vision. Computer vision focuses on analyzing visual data.

Correct answer is: Text summarization

Q.11 Which color model is most commonly used in computer vision?

CMYK

RGB

HSV

XYZ

Explanation - The RGB color model is the most widely used for representing digital images in computer vision.

Correct answer is: RGB

Q.12 Which type of neural network is best suited for sequential image data like videos?

Convolutional Neural Networks

Recurrent Neural Networks

Autoencoders

Generative Adversarial Networks

Explanation - RNNs model dependencies across time steps, which suits sequential data like video frames. In practice they are often combined with CNNs that extract features from each frame.

Correct answer is: Recurrent Neural Networks

Q.13 Which preprocessing step helps improve model performance in image classification?

Image normalization

Word tokenization

Feature hashing

Stopword removal

Explanation - Normalization rescales pixel values to a consistent range (for example, 0 to 1), which typically improves convergence when training vision models.

Correct answer is: Image normalization
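
One simple form of normalization divides every pixel by the maximum possible value. The sketch below assumes 8-bit pixels (0 to 255); the function name normalize_pixels and the sample image are made up for this example, and other schemes such as subtracting a mean and dividing by a standard deviation are also common.

```python
def normalize_pixels(image, max_value=255.0):
    """Scale every pixel into the range [0, 1] by dividing by max_value."""
    return [[pixel / max_value for pixel in row] for row in image]


# Hypothetical 2x2 grayscale image with 8-bit pixel values.
image = [[0, 51], [102, 255]]
print(normalize_pixels(image))  # [[0.0, 0.2], [0.4, 1.0]]
```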

Q.14 Which type of problem does semantic segmentation solve?

Image-level labeling

Pixel-level labeling

Video frame prediction

Noise removal

Explanation - Semantic segmentation assigns a class label to every pixel, so object regions and their boundaries are delineated at pixel level.

Correct answer is: Pixel-level labeling

Q.15 What does 'ImageNet' provide for computer vision research?

Audio samples

Large-scale labeled images

Text corpus

Speech datasets

Explanation - ImageNet is a large dataset of labeled images that is widely used for benchmarking computer vision models.

Correct answer is: Large-scale labeled images

Q.16 What is the role of convolution in CNNs?

To reduce dataset size

To extract features from images

To randomize pixels

To increase resolution

Explanation - Convolution operations apply filters that detect local patterns like edges, textures, or shapes in images.

Correct answer is: To extract features from images
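
To make "applying a filter" concrete, here is a small sketch that slides a kernel over an image with no padding and stride 1, summing the element-wise products at each position (the cross-correlation form that CNN layers typically compute). The function name correlate2d_valid, the image, and the hand-picked kernel are assumptions for illustration; in a CNN the kernel weights are learned.

```python
def correlate2d_valid(image, kernel):
    """Slide the kernel over the image (no padding, stride 1).

    Each output value is the sum of element-wise products between
    the kernel and the image patch underneath it.
    """
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = len(image) - k_h + 1
    out_w = len(image[0]) - k_w + 1
    return [
        [
            sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(k_h)
                for j in range(k_w)
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# Hypothetical image with a vertical dark-to-bright step in the middle.
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]
# Hand-picked kernel that responds to left-to-right intensity increases.
kernel = [
    [-1, 1],
    [-1, 1],
]
print(correlate2d_valid(image, kernel))  # [[0, 18, 0], [0, 18, 0]]
```

The output is non-zero only where the kernel straddles the step, which is how a filter "detects" a local pattern such as a vertical edge.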

Q.17 Which technique helps prevent overfitting in CNNs?

Dropout

Pooling

Batch normalization

Clustering

Explanation - Dropout randomly deactivates neurons during training, reducing overfitting in CNNs.

Correct answer is: Dropout
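
The sketch below shows one widely used variant, often called inverted dropout: each activation is zeroed with probability p_drop during training, and the survivors are scaled by 1 / (1 - p_drop) so the expected sum is unchanged. The function name dropout, the seed, and the sample activations are assumptions for this example; at inference time dropout is normally switched off.

```python
import random


def dropout(activations, p_drop, rng):
    """Inverted dropout for training: zero each value with probability p_drop
    and scale the kept values by 1 / (1 - p_drop)."""
    if not 0.0 <= p_drop < 1.0:
        raise ValueError("p_drop must be in [0, 1)")
    keep_prob = 1.0 - p_drop
    return [
        value / keep_prob if rng.random() >= p_drop else 0.0
        for value in activations
    ]


rng = random.Random(0)  # fixed seed so the run is repeatable
activations = [1.0] * 10  # hypothetical layer outputs
print(dropout(activations, p_drop=0.5, rng=rng))
```

With p_drop=0.5, every printed value is either 0.0 (dropped) or 2.0 (kept and rescaled).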

Q.18 Which of the following is an image augmentation technique?

Rotation

Stopword removal

Tokenization

Normalization of text

Explanation - Rotation is an image augmentation method used to artificially increase dataset size and variability.

Correct answer is: Rotation
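
As a small illustration, the sketch below produces four variants of one image by rotating it in 90-degree steps. Restricting to multiples of 90 degrees is a simplification made here because it needs no interpolation; the function names rotate90_clockwise and rotations are hypothetical.

```python
def rotate90_clockwise(image):
    """Rotate a list-of-lists image by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def rotations(image):
    """Return the image rotated by 0, 90, 180, and 270 degrees clockwise."""
    variants = [image]
    for _ in range(3):
        variants.append(rotate90_clockwise(variants[-1]))
    return variants


# Hypothetical 2x2 image.
image = [[1, 2],
         [3, 4]]
for variant in rotations(image):
    print(variant)
# [[1, 2], [3, 4]]
# [[3, 1], [4, 2]]
# [[4, 3], [2, 1]]
# [[2, 4], [1, 3]]
```

One labelled image becomes four training samples that share the same label.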

Q.19 Optical Character Recognition (OCR) is used to:

Detect objects in images

Convert text in images into editable text

Translate spoken words

Detect emotions in faces

Explanation - OCR extracts text from images and converts it into an editable, searchable format.

Correct answer is: Convert text in images into editable text

Q.20 Which filter highlights edges in an image?

Gaussian blur

Sobel filter

Mean filter

Median filter

Explanation - The Sobel filter emphasizes edges by detecting intensity gradients in an image.

Correct answer is: Sobel filter
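
The following sketch applies the standard 3x3 Sobel kernels to estimate horizontal and vertical intensity gradients, then combines them into a gradient magnitude. The helper names apply_3x3 and sobel_magnitude and the sample image are assumptions for this example, and border pixels are simply skipped (no padding).

```python
import math

# Standard 3x3 Sobel kernels for horizontal (x) and vertical (y) gradients.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def apply_3x3(image, kernel, r, c):
    """Weighted sum of the 3x3 patch whose top-left corner is (r, c)."""
    return sum(
        image[r + i][c + j] * kernel[i][j]
        for i in range(3)
        for j in range(3)
    )


def sobel_magnitude(image):
    """Gradient magnitude at every position where a 3x3 patch fits."""
    height, width = len(image), len(image[0])
    return [
        [
            math.hypot(
                apply_3x3(image, SOBEL_X, r, c),
                apply_3x3(image, SOBEL_Y, r, c),
            )
            for c in range(width - 2)
        ]
        for r in range(height - 2)
    ]


# Hypothetical image with a vertical edge between columns 1 and 2.
image = [
    [0, 0, 10, 10],
    [0, 0, 10, 10],
    [0, 0, 10, 10],
    [0, 0, 10, 10],
]
print(sobel_magnitude(image))  # [[40.0, 40.0], [40.0, 40.0]]
```

Every 3x3 patch in this image overlaps the edge, so all four outputs are large; a uniform image would give 0.0 everywhere.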

Q.21 Which of these architectures is most commonly used for real-time object detection?

YOLO

RNN

KNN

Naive Bayes

Explanation - YOLO (You Only Look Once) is a popular CNN-based architecture for real-time object detection.

Correct answer is: YOLO

Q.22 What does a bounding box represent in object detection?

Region containing an object

Background area of image

Noise removal filter

Color histogram

Explanation - Bounding boxes are rectangular regions that specify the position of detected objects in images.

Correct answer is: Region containing an object
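
One common way to store a bounding box is as a label plus the coordinates of its top-left and bottom-right corners. The sketch below assumes that convention; the class name BoundingBox, its fields, and the sample values are hypothetical, and some tools instead store a centre point with a width and height.

```python
from dataclasses import dataclass


@dataclass
class BoundingBox:
    """Axis-aligned box given by its top-left and bottom-right corners."""
    label: str
    x_min: int
    y_min: int
    x_max: int
    y_max: int

    def width(self):
        return self.x_max - self.x_min

    def height(self):
        return self.y_max - self.y_min

    def contains(self, x, y):
        """True if the point (x, y) lies inside or on the box border."""
        return self.x_min <= x <= self.x_max and self.y_min <= y <= self.y_max


# Hypothetical detection: a "cat" found in part of the image.
box = BoundingBox("cat", x_min=20, y_min=30, x_max=120, y_max=90)
print(box.width(), box.height())  # 100 60
print(box.contains(50, 50))       # True
print(box.contains(10, 50))       # False
```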

Q.23 In CNNs, 'stride' refers to:

The step size of the filter movement

The number of hidden layers

The dropout probability

The image resolution

Explanation - Stride determines how far the filter moves across the image during convolution.

Correct answer is: The step size of the filter movement
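
The effect of stride can be seen by listing where the filter is placed along one axis. The sketch below assumes no padding; the function names filter_positions and conv_output_size and the sizes used are chosen only for illustration.

```python
def filter_positions(input_size, kernel_size, stride):
    """Start indices of the filter along one axis, with no padding."""
    return list(range(0, input_size - kernel_size + 1, stride))


def conv_output_size(input_size, kernel_size, stride):
    """Number of filter placements along one axis, with no padding."""
    return (input_size - kernel_size) // stride + 1


# A 3-wide filter moving across a 7-wide input.
print(filter_positions(7, 3, 1))  # [0, 1, 2, 3, 4]
print(filter_positions(7, 3, 2))  # [0, 2, 4]
print(conv_output_size(7, 3, 2))  # 3
```

Doubling the stride from 1 to 2 skips every other position, so the output shrinks from 5 values to 3.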

Q.24 Which of the following is a challenge in computer vision?

Variations in lighting and pose

High availability of labeled data

Easy interpretability of models

Unlimited computational resources

Explanation - Changes in lighting, scale, and pose make it difficult for vision systems to recognize objects consistently.

Correct answer is: Variations in lighting and pose

Q.25 Which deep learning technique is used for generating realistic images?

GANs

RNNs