Stream Data Mining MCQs Practice Set

Q.1 What is the primary challenge in stream data mining compared to traditional data mining?

Data is static and small

Data arrives continuously and rapidly

Data is always clean and structured

There are no storage limitations

Explanation - Stream data is continuous and potentially unbounded, so algorithms must process it in real time, typically in a single pass, without storing everything.

Correct answer is: Data arrives continuously and rapidly

Q.2 Which of the following is a common approach in stream data mining to handle unbounded data?

Storing all historical data

Windowing techniques

Ignoring old data

Manual analysis of batches

Explanation - Windowing techniques allow algorithms to focus on the most recent data, making real-time analysis feasible without storing the entire stream.

Correct answer is: Windowing techniques

Q.3 What is a sliding window in stream data mining?

A technique to visualize data streams

A fixed-size subset of the most recent data items

A method to delete data permanently

A type of database index

Explanation - A sliding window holds only the most recent data points (a fixed count, or in time-based variants a fixed time span) and evicts the oldest ones as new data arrives, allowing for real-time analysis.

Correct answer is: A fixed-size subset of the most recent data items
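
Example - A minimal Python sketch of a count-based sliding window, assuming a numeric stream and a running mean as the statistic of interest. The class name `SlidingWindowMean` and its methods are illustrative, not taken from any particular library.

```python
from collections import deque


class SlidingWindowMean:
    """Keeps the mean of the most recent `size` items of a stream."""

    def __init__(self, size):
        # A deque with maxlen drops its oldest item automatically on append.
        self.window = deque(maxlen=size)
        self.total = 0.0

    def add(self, x):
        # If the window is full, the oldest item is about to be evicted,
        # so remove its contribution from the running total first.
        if len(self.window) == self.window.maxlen:
            self.total -= self.window[0]
        self.window.append(x)
        self.total += x
        return self.total / len(self.window)


w = SlidingWindowMean(size=3)
means = [w.add(x) for x in [1, 2, 3, 4, 5]]
print(means)  # [1.0, 1.5, 2.0, 3.0, 4.0]
```

Memory use stays bounded by the window size no matter how long the stream runs.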

Q.4 Which algorithm is widely used for frequent pattern mining in streams?

Apriori

FP-Stream

K-Means

Decision Tree

Explanation - FP-Stream is specifically designed for mining frequent patterns over streaming data using incremental and window-based approaches.

Correct answer is: FP-Stream

Q.5 In stream clustering, which algorithm adapts to changing data distribution over time?

DBSCAN

CluStream

K-Means on static data

Naive Bayes

Explanation - CluStream maintains micro-clusters and updates them over time, allowing clustering to adapt to evolving data streams.

Correct answer is: CluStream

Q.6 What is concept drift in stream data mining?

Data becomes clean over time

Data distribution changes over time

Data stops arriving

All data points become identical

Explanation - Concept drift refers to changes over time in the statistical properties of the stream, such as the relationship between the inputs and the target, so models must adapt to maintain accuracy.

Correct answer is: Data distribution changes over time

Q.7 Which technique is commonly used to detect concept drift?

Decision tree pruning

Statistical tests and monitoring error rates

Storing all historical data

Data normalization

Explanation - Monitoring model performance and using statistical tests can indicate when the data distribution has shifted, signaling concept drift.

Correct answer is: Statistical tests and monitoring error rates
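
Example - One way to monitor error rates is sketched below in Python. It is loosely modelled on the Drift Detection Method (DDM), which is an assumption of this example rather than something the question prescribes: it tracks the running error rate p and its standard deviation s, remembers the lowest p + s seen so far, and signals drift when the current p + s exceeds that baseline by a margin. The class name, `min_samples`, and `drift_factor` are illustrative.

```python
import math


class ErrorRateDriftDetector:
    """Signals drift when the running error rate rises well above its best level."""

    def __init__(self, min_samples=30, drift_factor=3.0):
        self.min_samples = min_samples    # warm-up before any decision is made
        self.drift_factor = drift_factor  # how many std devs count as drift
        self.n = 0
        self.errors = 0
        self.best = math.inf  # lowest p + s observed so far
        self.best_p = 0.0
        self.best_s = 0.0

    def update(self, is_error):
        """Feed one prediction outcome; returns True if drift is signalled."""
        self.n += 1
        self.errors += int(is_error)
        p = self.errors / self.n
        s = math.sqrt(p * (1.0 - p) / self.n)
        if self.n < self.min_samples:
            return False
        if p + s < self.best:
            self.best, self.best_p, self.best_s = p + s, p, s
        # Note: if no error has occurred yet, best_s is 0 and the first
        # error will be flagged; a production detector would guard this case.
        return p + s > self.best_p + self.drift_factor * self.best_s


detector = ErrorRateDriftDetector()
# Phase 1: a stable 10% error rate. Phase 2: the model is wrong every time.
stable = [detector.update(i % 10 == 0) for i in range(200)]
after_change = [detector.update(True) for _ in range(50)]
print(any(stable), any(after_change))  # False True
```

After a drift signal, an adaptive system would typically reset the detector and retrain or replace the model.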

Q.8 Why are traditional batch learning algorithms often unsuitable for stream data mining?

They cannot handle labeled data

They require multiple passes over all data

They are too fast

They work only on numeric data

Explanation - Stream data is continuous and unbounded, so algorithms that need multiple passes over the entire dataset are impractical.

Correct answer is: They require multiple passes over all data

Q.9 Which type of data summarization is commonly used in stream mining?

Exact storage of all events

Synopsis data structures like sketches and histograms

Manual logging

Storing only the first data point

Explanation - Synopsis structures provide compact summaries of large data streams, allowing approximate answers while reducing memory usage.

Correct answer is: Synopsis data structures like sketches and histograms

Q.10 What is the main advantage of online learning algorithms in stream mining?

They store the full dataset

They update the model incrementally with each new data point

They need multiple passes over data

They ignore new data

Explanation - Online learning algorithms continuously update their model with incoming data, which is essential for real-time stream analysis.

Correct answer is: They update the model incrementally with each new data point
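
Example - A minimal Python sketch of incremental updating, using a perceptron as the model. The choice of a perceptron and the names `OnlinePerceptron` and `learn_one` are assumptions for illustration. Each example is seen once, used to adjust the weights if it was misclassified, and then discarded.

```python
class OnlinePerceptron:
    """Binary classifier (labels +1 / -1) updated one example at a time."""

    def __init__(self, n_features, lr=1.0):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict(self, x):
        score = sum(wi * xi for wi, xi in zip(self.w, x)) + self.b
        return 1 if score >= 0 else -1

    def learn_one(self, x, y):
        # Update only on a mistake; the example is not stored afterwards.
        if self.predict(x) != y:
            self.w = [wi + self.lr * y * xi for wi, xi in zip(self.w, x)]
            self.b += self.lr * y


stream = [((2, 0), 1), ((0, 2), -1), ((3, 1), 1), ((1, 3), -1)]
model = OnlinePerceptron(n_features=2)
for x, y in stream:
    model.learn_one(x, y)

print(model.w, model.b)  # [2.0, -4.0] -1.0
```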

Q.11 In stream classification, which approach helps maintain accuracy in the presence of concept drift?

Static decision trees

Ensemble learning and adaptive classifiers

Storing all historical data

Ignoring old errors

Explanation - Adaptive classifiers and ensembles can adjust to changes in data distribution, improving robustness against concept drift.

Correct answer is: Ensemble learning and adaptive classifiers

Q.12 What is the difference between a landmark window and a sliding window in stream mining?

Landmark window covers all data since a fixed starting point, sliding window covers only the most recent data

Sliding window covers all data since a fixed starting point, landmark window covers only the most recent data

Both are identical

Neither stores any data

Explanation - Landmark windows consider all data since a specific point (landmark), while sliding windows consider only the most recent data.

Correct answer is: Landmark window covers all data since a fixed starting point, sliding window covers only the most recent data
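
Example - The contrast can be sketched in Python by computing a sum over both window types on the same stream. The function name and its parameters (`landmark_index`, `window_size`) are hypothetical and chosen only for this illustration.

```python
from collections import deque


def landmark_and_sliding_sums(stream, landmark_index, window_size):
    """Returns (landmark_sum, sliding_sum) after each arriving item."""
    landmark_sum = 0
    window = deque(maxlen=window_size)  # keeps only the newest items
    results = []
    for i, x in enumerate(stream):
        # Landmark window: everything from the landmark onward is included.
        if i >= landmark_index:
            landmark_sum += x
        # Sliding window: the oldest item drops out once the window is full.
        window.append(x)
        results.append((landmark_sum, sum(window)))
    return results


sums = landmark_and_sliding_sums([1, 2, 3, 4, 5], landmark_index=1, window_size=2)
print(sums)  # [(0, 1), (2, 3), (5, 5), (9, 7), (14, 9)]
```

The landmark sum keeps growing because nothing after the landmark is ever dropped, while the sliding sum only ever reflects the last two items.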

Q.13 Which of the following is NOT a common stream data mining task?

Classification

Clustering

Frequent pattern mining

Manual spreadsheet editing

Explanation - Stream data mining focuses on automated tasks like classification, clustering, and pattern mining rather than manual operations.

Correct answer is: Manual spreadsheet editing

Q.14 What is micro-clustering in the context of stream clustering?

Creating clusters of very large datasets

Maintaining summary statistics of small clusters over time

Clustering historical data only

Clustering only numeric attributes

Explanation - Micro-clustering keeps compact summaries of data points, which can later be merged or analyzed to form macro clusters.

Correct answer is: Maintaining summary statistics of small clusters over time
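
Example - A minimal Python sketch of a micro-cluster summary, assuming the common cluster-feature form (count, per-dimension linear sum, per-dimension squared sum). CluStream additionally tracks timestamp statistics, which are omitted here; the class and method names are illustrative.

```python
import math


class MicroCluster:
    """Summary of a group of points: count, linear sum, squared sum."""

    def __init__(self, dim):
        self.n = 0
        self.ls = [0.0] * dim  # per-dimension sum of values
        self.ss = [0.0] * dim  # per-dimension sum of squared values

    def absorb(self, point):
        # The point updates the summary and is then discarded.
        self.n += 1
        for i, v in enumerate(point):
            self.ls[i] += v
            self.ss[i] += v * v

    def centroid(self):
        return [s / self.n for s in self.ls]

    def radius(self):
        # Root-mean-square deviation of the points from the centroid.
        var = sum(ss / self.n - (ls / self.n) ** 2
                  for ls, ss in zip(self.ls, self.ss))
        return math.sqrt(max(var, 0.0))

    def merge(self, other):
        # Summaries are additive, so merging needs no raw data.
        self.n += other.n
        self.ls = [a + b for a, b in zip(self.ls, other.ls)]
        self.ss = [a + b for a, b in zip(self.ss, other.ss)]


a = MicroCluster(dim=2)
for p in [(0, 0), (2, 0)]:
    a.absorb(p)
b = MicroCluster(dim=2)
for p in [(4, 0), (6, 0)]:
    b.absorb(p)

print(a.centroid(), a.radius())  # [1.0, 0.0] 1.0
a.merge(b)
print(a.centroid())  # [3.0, 0.0]
```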

Q.15 Which stream mining algorithm is suitable for detecting rare events?

Hoeffding Tree

SWIM (Sliding Window Interestingness Mining)

K-Means

Apriori

Explanation - SWIM focuses on detecting unusual or rare patterns in a sliding window of stream data, which standard algorithms may miss.

Correct answer is: SWIM (Sliding Window Interestingness Mining)

Q.16 Why is memory management critical in stream data mining?

Streams are always small

Streams are unbounded and cannot be fully stored

Streams do not require processing

Memory has no effect on algorithm speed

Explanation - Because data streams are potentially infinite, algorithms must summarize or selectively store data to operate within memory limits.

Correct answer is: Streams are unbounded and cannot be fully stored

Q.17 Which of the following is a challenge unique to stream data mining?

Data cleaning

Real-time processing of evolving data

SQL query execution

Data normalization

Explanation - Unlike traditional mining, stream mining must handle data that evolves over time and requires real-time analysis.

Correct answer is: Real-time processing of evolving data

Q.18 Which evaluation metric is commonly used to measure stream classification performance?

Mean squared error

Accuracy over time with incremental updates

Pearson correlation

Disk space usage

Explanation - Stream classification often uses metrics like time-dependent accuracy or F1-score to monitor performance as the model adapts to new data.

Correct answer is: Accuracy over time with incremental updates
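
Example - Accuracy over time is often measured with a test-then-train (prequential) loop: each arriving example is first used to test the model and only then to update it. The Python sketch below assumes a deliberately trivial majority-class model; `MajorityClass` and `prequential_accuracy` are illustrative names.

```python
from collections import Counter


class MajorityClass:
    """Predicts the most frequent label seen so far (ignores features)."""

    def __init__(self):
        self.counts = Counter()

    def predict(self, x):
        if not self.counts:
            return None  # nothing learned yet
        return self.counts.most_common(1)[0][0]

    def learn_one(self, x, y):
        self.counts[y] += 1


def prequential_accuracy(model, stream):
    """Returns the running accuracy after each example (test, then train)."""
    correct = 0
    history = []
    for seen, (x, y) in enumerate(stream, start=1):
        if model.predict(x) == y:  # test first
            correct += 1
        model.learn_one(x, y)      # then train
        history.append(correct / seen)
    return history


stream = [(None, "a"), (None, "a"), (None, "b"), (None, "a")]
history = prequential_accuracy(MajorityClass(), stream)
print(history[-1])  # 0.5
```

Because every example is tested before the model has seen it, no separate hold-out set is needed.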

Q.19 What is the role of synopsis data structures like count-min sketch in stream mining?

Exact storage of all data

Compact approximation of frequencies or aggregates

Visualizing clusters

Sorting incoming data

Explanation - These structures provide memory-efficient approximations for queries like frequency counts, suitable for high-speed streams.

Correct answer is: Compact approximation of frequencies or aggregates
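
Example - A minimal Python sketch of a count-min sketch. The table dimensions and the use of salted BLAKE2b hashes to obtain one hash function per row are assumptions of this example. Estimates can overcount because of collisions but never undercount.

```python
import hashlib


class CountMinSketch:
    """Approximate frequency counter using a fixed depth x width table."""

    def __init__(self, width=272, depth=5):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, item, row):
        # One hash function per row, derived by salting with the row number.
        digest = hashlib.blake2b(
            str(item).encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, item, count=1):
        for row in range(self.depth):
            self.table[row][self._index(item, row)] += count

    def estimate(self, item):
        # Collisions only inflate counters, so the minimum is the best guess.
        return min(self.table[row][self._index(item, row)]
                   for row in range(self.depth))


sketch = CountMinSketch()
for word in "a b a c a b".split():
    sketch.add(word)

est_a = sketch.estimate("a")
print(est_a >= 3)  # True: the true count of "a" is 3
```

Memory use is fixed at width x depth counters regardless of how many distinct items appear in the stream.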

Q.20 In stream association rule mining, why are exact counts often impractical?

Data streams are small

Streams are unbounded and memory is limited

Rules are always irrelevant

No algorithm exists for counting

Explanation - Since data streams are potentially infinite, approximate counting using windowing or synopsis structures is necessary.

Correct answer is: Streams are unbounded and memory is limited

Q.21 What is the Hoeffding bound used for in stream classification?

To determine confidence in model updates with limited data

To compute exact cluster centers

To visualize sliding windows

To sort streaming data

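Explanation - The Hoeffding bound limits, with a chosen confidence, how far the mean observed over n samples can fall from the true mean. This means a decision made from a sample of the stream (such as choosing a split attribute in a Hoeffding tree) is, with high probability, the same one that would be made with all the data, enabling incremental learning.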

Correct answer is: To determine confidence in model updates with limited data
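
Example - For a variable with range R observed n times, the bound is epsilon = sqrt(R^2 * ln(1/delta) / (2n)): with probability at least 1 - delta, the true mean is at least the observed mean minus epsilon. The Python sketch below computes it and applies it as a Hoeffding tree would to a split decision. The function names and the sample gain values are hypothetical.

```python
import math


def hoeffding_bound(value_range, delta, n):
    """Epsilon for n observations of a variable with the given range."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def can_split(best_gain, second_gain, value_range, delta, n):
    # Split only when the observed lead of the best attribute over the
    # runner-up is larger than epsilon.
    return (best_gain - second_gain) > hoeffding_bound(value_range, delta, n)


# Same observed gap of 0.05, evaluated after 100 and after 10,000 examples.
early = can_split(0.30, 0.25, value_range=1.0, delta=1e-7, n=100)
late = can_split(0.30, 0.25, value_range=1.0, delta=1e-7, n=10_000)
print(early, late)  # False True
```

Epsilon shrinks as more examples arrive, so a small but consistent lead eventually becomes enough to justify the split.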

Q.22 Which of the following is an advantage of incremental algorithms in stream mining?

They discard new data

They process each data point once and update models

They require full batch processing

They ignore changes in data

Explanation - Incremental algorithms are suitable for streams because they update the model on the fly without multiple passes or full storage.

Correct answer is: They process each data point once and update models

Q.23 What is the main purpose of data aging in stream mining?

To store all historical data permanently

To gradually reduce the influence of old data

To ignore new incoming data

To normalize data

Explanation - Data aging ensures that recent trends have more impact on the model than outdated information, improving adaptability.

Correct answer is: To gradually reduce the influence of old data
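
Example - One common way to age data is exponential decay, sketched below in Python as a decayed mean. The decay factor and the class name `DecayedMean` are assumptions for illustration. Every arrival multiplies the weight of all earlier items by `decay`, so old values fade gradually rather than being cut off.

```python
class DecayedMean:
    """Mean in which each older item counts `decay` times less per step."""

    def __init__(self, decay=0.9):
        self.decay = decay
        self.weighted_sum = 0.0
        self.weight = 0.0

    def add(self, x):
        # Shrink the influence of everything seen so far, then add the new item.
        self.weighted_sum = self.decay * self.weighted_sum + x
        self.weight = self.decay * self.weight + 1.0
        return self.weighted_sum / self.weight


aged = DecayedMean(decay=0.5)
for x in [0, 0, 0, 10]:
    current = aged.add(x)

plain_mean = (0 + 0 + 0 + 10) / 4
print(current > plain_mean)  # True: the recent value 10 dominates
```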

Q.24 Which scenario is ideal for applying stream data mining?

Analyzing historical census data

Monitoring network traffic in real-time

Batch processing of financial reports

Data entry in spreadsheets

Explanation - Stream data mining is suitable for real-time applications where data arrives continuously and timely insights are critical.

Correct answer is: Monitoring network traffic in real-time

Q.25 Which property distinguishes data streams from static datasets?

Finite and fixed

Continuous and potentially unbounded

Always numeric

Never evolving

Explanation - Data streams are characterized by their continuous flow and potentially infinite size, unlike static datasets which are finite.

Correct answer is: Continuous and potentially unbounded
