Q.1 What is the primary goal of computer vision?
To simulate human thinking
To enable machines to understand visual data
To improve text processing
To generate random images
Explanation - Computer vision focuses on enabling machines to interpret and understand visual data like images and videos.
Correct answer is: To enable machines to understand visual data
Q.2 Which of the following is a typical application of computer vision?
Spam email filtering
Face recognition
Text translation
Audio synthesis
Explanation - Face recognition is a direct application of computer vision where systems identify or verify a person’s face in images or videos.
Correct answer is: Face recognition
Q.3 Which type of data is mainly used in computer vision?
Audio signals
Image and video data
Numerical tabular data
Text data
Explanation - Computer vision specifically deals with image and video data, unlike NLP or speech processing domains.
Correct answer is: Image and video data
Q.4 Which algorithm is most commonly used in modern computer vision tasks?
Decision Trees
Support Vector Machines
Convolutional Neural Networks
K-Means Clustering
Explanation - CNNs are widely used in computer vision because they are effective at extracting spatial features from images.
Correct answer is: Convolutional Neural Networks
Q.5 Edge detection in images is primarily used to:
Remove noise
Highlight object boundaries
Convert color images to grayscale
Enhance image brightness
Explanation - Edge detection identifies sudden changes in pixel intensity, which usually correspond to object boundaries.
Correct answer is: Highlight object boundaries
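The idea in the explanation above can be sketched on a single row of pixels. This is a minimal illustration, not a production edge detector; the function name `intensity_jumps` and the threshold value are assumptions made for this example.

```python
def intensity_jumps(row, threshold):
    """Return each index i where the intensity change between
    row[i] and row[i + 1] is larger than the threshold."""
    return [
        i
        for i in range(len(row) - 1)
        if abs(row[i + 1] - row[i]) > threshold
    ]


# One row of grayscale pixels: a dark region followed by a bright region.
row = [10, 12, 11, 200, 202, 201]
edges = intensity_jumps(row, threshold=50)
print(edges)  # [2] -> the boundary lies between index 2 and index 3
```

Small fluctuations (10, 12, 11) stay below the threshold, so only the sharp dark-to-bright transition is reported.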
Q.6 Which technique reduces image size while retaining key information?
Upsampling
Pooling
Normalization
Augmentation
Explanation - Pooling layers in CNNs reduce the spatial dimensions of feature maps, keeping the strongest responses while lowering the computational cost of later layers.
Correct answer is: Pooling
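As a rough sketch of the pooling described above, the following applies 2x2 max pooling with stride 2 to a small feature map. The function name `max_pool_2x2` and the sample values are assumptions for illustration; real frameworks provide their own pooling layers.

```python
def max_pool_2x2(fmap):
    """Keep the largest value in each non-overlapping 2x2 window.
    Assumes the height and width of fmap are even."""
    h, w = len(fmap), len(fmap[0])
    return [
        [
            max(fmap[r][c], fmap[r][c + 1], fmap[r + 1][c], fmap[r + 1][c + 1])
            for c in range(0, w - 1, 2)
        ]
        for r in range(0, h - 1, 2)
    ]


feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [0, 1, 7, 2],
    [3, 2, 1, 4],
]
pooled = max_pool_2x2(feature_map)
print(pooled)  # [[4, 5], [3, 7]]
```

The 4x4 input becomes a 2x2 output, so the number of values drops by a factor of four while the strongest response in each window is kept.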
Q.7 Which dataset is widely used for handwritten digit recognition?
ImageNet
CIFAR-10
MNIST
COCO
Explanation - The MNIST dataset is a benchmark dataset of handwritten digits, commonly used for training computer vision models.
Correct answer is: MNIST
Q.8 Image classification refers to:
Assigning a label to an entire image
Finding the location of objects in an image
Separating background from objects
Restoring corrupted images
Explanation - Image classification tasks involve predicting the category or label of the entire image, unlike detection or segmentation.
Correct answer is: Assigning a label to an entire image
Q.9 Object detection differs from image classification by:
Detecting objects without labels
Providing both object labels and their locations
Working only on grayscale images
Focusing only on image enhancement
Explanation - Unlike classification, object detection finds objects in an image and provides bounding boxes along with labels.
Correct answer is: Providing both object labels and their locations
Q.10 Which of these is NOT an application of computer vision?
Autonomous driving
Medical image analysis
Text summarization
Security surveillance
Explanation - Text summarization is an NLP task, not computer vision. Computer vision focuses on analyzing visual data.
Correct answer is: Text summarization
Q.11 Which color model is most commonly used in computer vision?
CMYK
RGB
HSV
XYZ
Explanation - The RGB color model is the most widely used for representing digital images in computer vision.
Correct answer is: RGB
Q.12 Which type of neural network is best suited for sequential image data like videos?
Convolutional Neural Networks
Recurrent Neural Networks
Autoencoders
Generative Adversarial Networks
Explanation - RNNs (often combined with CNNs) are suited for sequential data like videos where time dependencies matter.
Correct answer is: Recurrent Neural Networks
Q.13 Which preprocessing step helps improve model performance in image classification?
Image normalization
Word tokenization
Feature hashing
Stopword removal
Explanation - Normalization rescales pixel values to a consistent range (for example, 0 to 1), which typically improves convergence when training vision models.
Correct answer is: Image normalization
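One common form of normalization, assuming 8-bit pixels in the range 0 to 255, is to divide every value by 255. The sketch below shows only that variant; the function name `scale_to_unit` is an assumption, and other schemes (such as subtracting a mean and dividing by a standard deviation) are also widely used.

```python
def scale_to_unit(pixels):
    """Map 8-bit pixel values (0..255) into the range 0.0..1.0."""
    return [[p / 255.0 for p in row] for row in pixels]


image = [
    [0, 51],
    [102, 255],
]
scaled = scale_to_unit(image)
print(scaled)  # values are approximately [[0.0, 0.2], [0.4, 1.0]]
```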
Q.14 Which type of problem does semantic segmentation solve?
Image-level labeling
Pixel-level labeling
Video frame prediction
Noise removal
Explanation - Semantic segmentation assigns a label to each pixel, distinguishing object boundaries in detail.
Correct answer is: Pixel-level labeling
Q.15 What does 'ImageNet' provide for computer vision research?
Audio samples
Large-scale labeled images
Text corpus
Speech datasets
Explanation - ImageNet is a large dataset of labeled images that is widely used for benchmarking computer vision models.
Correct answer is: Large-scale labeled images
Q.16 What is the role of convolution in CNNs?
To reduce dataset size
To extract features from images
To randomize pixels
To increase resolution
Explanation - Convolution operations apply filters that detect local patterns like edges, textures, or shapes in images.
Correct answer is: To extract features from images
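To make the filtering step concrete, here is a minimal sketch of a 2D convolution with no padding and stride 1. As in most deep learning libraries, the kernel is not flipped, so this is strictly a cross-correlation. The function name `conv2d_valid` and the hand-picked kernel are assumptions for illustration; in a CNN the kernel values are learned.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image and, at each position,
    sum the elementwise products of kernel and image patch."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(kh)
                for j in range(kw)
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# Dark left half, bright right half.
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]
# Responds when the right column of a patch is brighter than the left.
vertical_edge_kernel = [
    [-1, 1],
    [-1, 1],
]
response = conv2d_valid(image, vertical_edge_kernel)
print(response)  # [[0, 18, 0], [0, 18, 0]]
```

The output is zero over flat regions and large only where the patch straddles the dark-to-bright transition, which is the "local pattern" this particular filter detects.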
Q.17 Which technique helps prevent overfitting in CNNs?
Dropout
Pooling
Batch normalization
Clustering
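Explanation - Dropout randomly deactivates a fraction of neurons at each training step, so the network cannot rely on any single unit, which reduces overfitting. Batch normalization can have a mild regularizing effect, but its main purpose is to stabilize training.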
Correct answer is: Dropout
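The mechanism can be sketched as follows. This example uses the "inverted dropout" variant, in which surviving activations are scaled up by 1 / keep probability so that their expected value is unchanged; that choice, along with the function name `dropout`, is an assumption for this illustration.

```python
import random


def dropout(activations, drop_prob, rng):
    """Zero each activation with probability drop_prob (training time only).
    Survivors are scaled by 1 / (1 - drop_prob), i.e. inverted dropout."""
    if not 0.0 <= drop_prob < 1.0:
        raise ValueError("drop_prob must be in [0, 1)")
    keep_prob = 1.0 - drop_prob
    return [
        a / keep_prob if rng.random() < keep_prob else 0.0
        for a in activations
    ]


rng = random.Random(0)  # seeded only to make the run repeatable
activations = [1.0] * 10
dropped = dropout(activations, drop_prob=0.5, rng=rng)
print(dropped)  # each entry is either 0.0 (dropped) or 2.0 (kept and rescaled)
```

At inference time dropout is switched off and the activations are passed through unchanged.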
Q.18 Which of the following is an image augmentation technique?
Rotation
Stopword removal
Tokenization
Normalization of text
Explanation - Rotation is an image augmentation method used to artificially increase dataset size and variability.
Correct answer is: Rotation
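A minimal sketch of rotation as augmentation, restricted to 90-degree steps so that no interpolation is needed. The function name `rotate_90_clockwise` is an assumption; augmentation libraries typically support arbitrary angles.

```python
def rotate_90_clockwise(image):
    """Rotate a 2D image (list of rows) by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


image = [
    [1, 2],
    [3, 4],
]
rotated = rotate_90_clockwise(image)
print(rotated)  # [[3, 1], [4, 2]]

# One source image yields four training variants (0, 90, 180, 270 degrees).
variants = [image]
for _ in range(3):
    variants.append(rotate_90_clockwise(variants[-1]))
```

Each variant keeps the same label as the source image, which is how augmentation increases variability without new annotation work.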
Q.19 Optical Character Recognition (OCR) is used to:
Detect objects in images
Convert text in images into editable text
Translate spoken words
Detect emotions in faces
Explanation - OCR extracts text from images, enabling editable and searchable formats.
Correct answer is: Convert text in images into editable text
Q.20 Which filter highlights edges in an image?
Gaussian blur
Sobel filter
Mean filter
Median filter
Explanation - The Sobel filter emphasizes edges by detecting intensity gradients in an image.
Correct answer is: Sobel filter
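The following sketch applies the two standard 3x3 Sobel kernels and combines them into a gradient magnitude. It processes only positions where the kernel fits fully inside the image; the function names and the sample image are assumptions for illustration.

```python
import math

# Standard Sobel kernels for horizontal (x) and vertical (y) gradients.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def apply_3x3(image, kernel, r, c):
    """Weighted sum of the 3x3 patch whose top-left corner is (r, c)."""
    return sum(
        image[r + i][c + j] * kernel[i][j]
        for i in range(3)
        for j in range(3)
    )


def sobel_magnitude(image):
    """Gradient magnitude sqrt(gx^2 + gy^2) at each valid position."""
    h, w = len(image), len(image[0])
    return [
        [
            math.hypot(
                apply_3x3(image, SOBEL_X, r, c),
                apply_3x3(image, SOBEL_Y, r, c),
            )
            for c in range(w - 2)
        ]
        for r in range(h - 2)
    ]


# Flat dark region on the left, bright region on the right.
image = [
    [0, 0, 0, 10, 10],
    [0, 0, 0, 10, 10],
    [0, 0, 0, 10, 10],
]
magnitude = sobel_magnitude(image)
print(magnitude)  # [[0.0, 40.0, 40.0]]
```

The response is zero over the flat region and strong where the window overlaps the vertical boundary.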
Q.21 Which of these architectures is most commonly used for real-time object detection?
YOLO
RNN
KNN
Naive Bayes
Explanation - YOLO (You Only Look Once) is a popular CNN-based architecture for real-time object detection.
Correct answer is: YOLO
Q.22 What does a bounding box represent in object detection?
Region containing an object
Background area of image
Noise removal filter
Color histogram
Explanation - Bounding boxes are rectangular regions that specify the position of detected objects in images.
Correct answer is: Region containing an object
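One common way to store a bounding box is as the pixel coordinates of its top-left and bottom-right corners. The sketch below assumes that convention; the class name `BoundingBox`, its fields, and the sample values are hypothetical, and other formats (such as center point plus width and height) are also in use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """Axis-aligned box in pixel coordinates, with its class label."""
    label: str
    x_min: int
    y_min: int
    x_max: int
    y_max: int

    @property
    def width(self) -> int:
        return self.x_max - self.x_min

    @property
    def height(self) -> int:
        return self.y_max - self.y_min

    @property
    def area(self) -> int:
        return self.width * self.height

    def contains(self, x: int, y: int) -> bool:
        """True if the point (x, y) lies inside or on the box border."""
        return self.x_min <= x <= self.x_max and self.y_min <= y <= self.y_max


box = BoundingBox(label="cat", x_min=20, y_min=30, x_max=120, y_max=90)
print(box.width, box.height, box.area)  # 100 60 6000
print(box.contains(50, 50))  # True
print(box.contains(10, 50))  # False
```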
Q.23 In CNNs, 'stride' refers to:
The step size of the filter movement
The number of hidden layers
The dropout probability
The image resolution
Explanation - Stride determines how far the filter moves across the image during convolution.
Correct answer is: The step size of the filter movement
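The effect of stride on output size follows the usual formula floor((n + 2p - k) / s) + 1, where n is the input size, k the kernel size, p the padding, and s the stride. The helper name `conv_output_size` below is an assumption for illustration.

```python
def conv_output_size(input_size, kernel_size, stride=1, padding=0):
    """Output length along one dimension after a convolution."""
    if stride < 1:
        raise ValueError("stride must be at least 1")
    return (input_size + 2 * padding - kernel_size) // stride + 1


print(conv_output_size(7, 3, stride=1))  # 5
print(conv_output_size(7, 3, stride=2))  # 3
print(conv_output_size(28, 5, stride=1))  # 24
```

Doubling the stride from 1 to 2 roughly halves the output size, because the filter skips every other position.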
Q.24 Which of the following is a challenge in computer vision?
Variations in lighting and pose
High availability of labeled data
Easy interpretability of models
Unlimited computational resources
Explanation - Changes in lighting, scale, and pose make it difficult for vision systems to recognize objects consistently.
Correct answer is: Variations in lighting and pose
Q.25 Which deep learning technique is used for generating realistic images?
GANs
RNNs
Autoencoders
SVMs
Explanation - Generative Adversarial Networks (GANs) are used for generating realistic images by training two competing networks.
Correct answer is: GANs
