Sequential Pattern Mining: MCQs Practice Set

Q.1 What is sequential pattern mining primarily used for?

Finding frequent itemsets in a single transaction
Discovering frequent sequences of events or items over time
Classifying data into categories
Reducing dimensionality of datasets
Explanation - Sequential pattern mining identifies sequences of items or events that frequently occur in a specific order over time in a dataset.
Correct answer is: Discovering frequent sequences of events or items over time

Q.2 Which algorithm is specifically designed for sequential pattern mining?

Apriori
FP-Growth
GSP (Generalized Sequential Pattern)
K-Means
Explanation - The GSP algorithm is designed to find sequential patterns by extending the Apriori principle to sequences.
Correct answer is: GSP (Generalized Sequential Pattern)

Q.3 In sequential pattern mining, what does a 'sequence' consist of?

A set of unrelated transactions
An ordered list of itemsets
A single item repeated
A hierarchical cluster of data points
Explanation - A sequence is an ordered list of elements, each of which is a single item or a set of items occurring together; the order of the elements reflects their order in time.
Correct answer is: An ordered list of itemsets

Q.4 Which of the following is an example of sequential pattern mining in real life?

Clustering customers based on age
Finding purchase patterns in online shopping carts over time
Classifying emails as spam or not spam
Reducing features in high-dimensional data
Explanation - Sequential pattern mining is used to find temporal patterns, such as which products are often bought in sequence.
Correct answer is: Finding purchase patterns in online shopping carts over time

Q.5 What is the primary difference between frequent itemset mining and sequential pattern mining?

Sequential pattern mining ignores order, itemset mining considers order
Sequential pattern mining considers order, itemset mining ignores order
Itemset mining works only on numeric data
Sequential pattern mining works only on categorical data
Explanation - Sequential pattern mining requires the order of items to be preserved, unlike frequent itemset mining, which treats items as unordered.
Correct answer is: Sequential pattern mining considers order, itemset mining ignores order
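
A small Python illustration of the difference, using two made-up purchase histories (the customer names and items are assumptions for the example only):

```python
# Hypothetical purchase histories of two customers, in time order.
customer_1 = ["bread", "milk", "eggs"]
customer_2 = ["eggs", "milk", "bread"]

# Itemset view: order is discarded, so the two histories look identical.
same_itemset = set(customer_1) == set(customer_2)

# Sequence view: order is kept, so the two histories differ.
same_sequence = customer_1 == customer_2

print(same_itemset, same_sequence)  # True False
```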

Q.6 Which data structure is commonly used to efficiently store sequences in sequential pattern mining?

Hash Table
Sequence Tree
Graph
Adjacency Matrix
Explanation - Sequence trees, such as prefix trees, are used to store sequences efficiently for mining sequential patterns.
Correct answer is: Sequence Tree

Q.7 In GSP, what does the 'support' of a sequence refer to?

The total number of items in the sequence
The fraction of sequences containing that sequence
The maximum length of the sequence
The frequency of the first item only
Explanation - Support is the fraction of data sequences in the dataset that contain the given sequence as a subsequence. A sequence is reported as frequent only when its support meets the minimum threshold.
Correct answer is: The fraction of sequences containing that sequence
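
Support can be computed directly from this definition. The sketch below is only an illustration, not GSP's actual counting procedure: it assumes each data sequence is a list of Python sets (one set per element), and the names `contains`, `support` and the toy database are made up for the example.

```python
def contains(sequence, pattern):
    """True if pattern is a subsequence of sequence.

    Each pattern element must be a subset of a distinct sequence element,
    and the matched elements must appear in the same order.
    """
    pos = 0
    for element in pattern:
        # Greedily take the earliest element that can host this pattern element.
        while pos < len(sequence) and not element <= sequence[pos]:
            pos += 1
        if pos == len(sequence):
            return False
        pos += 1
    return True


def support(pattern, database):
    """Fraction of data sequences that contain the pattern."""
    return sum(contains(s, pattern) for s in database) / len(database)


# Toy database of four data sequences (assumed data).
db = [
    [{"a"}, {"b", "c"}, {"d"}],
    [{"a"}, {"d"}],
    [{"b"}, {"a"}, {"d"}],
    [{"c"}, {"b"}],
]

print(support([{"a"}, {"d"}], db))  # 0.75
print(support([{"b"}, {"d"}], db))  # 0.5
```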

Q.8 Which step comes first in the GSP algorithm?

Candidate sequence generation
Support counting
Finding frequent 1-sequences
Pruning infrequent sequences
Explanation - GSP begins by identifying all frequent 1-sequences, which are then used to generate longer candidate sequences iteratively.
Correct answer is: Finding frequent 1-sequences
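
The first pass can be sketched as a single scan that counts, for every item, the number of data sequences it occurs in. This is a simplified illustration under the assumption that each data sequence is a list of sets; `frequent_items` and the toy database are hypothetical names, not part of any GSP library.

```python
from collections import Counter


def frequent_items(database, min_sup):
    """Return {item: count} for items whose support is at least min_sup."""
    counts = Counter()
    for sequence in database:
        # Count each item at most once per data sequence, however often it repeats.
        counts.update(set().union(*sequence))
    threshold = min_sup * len(database)
    return {item: n for item, n in counts.items() if n >= threshold}


# Toy database (assumed data).
db = [
    [{"a"}, {"b", "c"}, {"d"}],
    [{"a"}, {"d"}],
    [{"b"}, {"a"}, {"d"}],
    [{"c"}, {"b"}],
]

print(frequent_items(db, 0.75))  # a, b and d each occur in 3 of 4 sequences; c is dropped
```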

Q.9 What is the main challenge in sequential pattern mining?

Handling unordered data
Efficiently finding long sequences in large datasets
Normalizing numeric data
Labeling sequences for classification
Explanation - The challenge is computational efficiency since the number of potential sequences grows exponentially with sequence length.
Correct answer is: Efficiently finding long sequences in large datasets

Q.10 Which algorithm improves upon GSP by using a pattern-growth approach?

Apriori
PrefixSpan
FP-Growth
KNN
Explanation - PrefixSpan grows sequences by exploring only the projected databases of frequent prefixes, which avoids the candidate-generation step that GSP relies on.
Correct answer is: PrefixSpan
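
The pattern-growth idea can be sketched in a few lines of Python. This is a simplified illustration, not the full PrefixSpan algorithm: it assumes every element of a sequence is a single item (no itemsets), uses an absolute support count, and the function name `prefixspan` is chosen for the example.

```python
from collections import Counter


def prefixspan(database, min_count):
    """Return {pattern (tuple): support count} for sequences of single items."""
    results = {}

    def grow(prefix, projected):
        # Count each item once per projected suffix.
        counts = Counter()
        for suffix in projected:
            counts.update(set(suffix))
        for item, count in counts.items():
            if count < min_count:
                continue
            pattern = prefix + (item,)
            results[pattern] = count
            # Project again: keep what follows the first occurrence of the item.
            new_projected = [s[s.index(item) + 1:] for s in projected if item in s]
            grow(pattern, new_projected)

    grow((), [list(s) for s in database])
    return results


# Toy database (assumed data).
db = [["a", "b", "c"], ["a", "c"], ["b", "a", "c"], ["b", "c"]]
patterns = prefixspan(db, min_count=3)
print(patterns)
```

With this toy database the frequent patterns are (a), (b), (c), (a, c) and (b, c). Each pattern is only ever extended with items that are frequent in its own projected database, so no separate candidate list is built.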

Q.11 In sequential pattern mining, what is a 'projected database'?

Subset of sequences containing a given prefix
A database of all single items
The input dataset transformed into clusters
All sequences with support below threshold
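Explanation - For a given prefix, the projected database is built from the sequences that contain that prefix: each such sequence contributes the suffix that follows the first occurrence of the prefix. Pattern-growth methods like PrefixSpan mine these smaller databases recursively.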
Correct answer is: Subset of sequences containing a given prefix
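
A minimal sketch of the projection step, assuming sequences of single items stored as Python lists; `project` and the toy data are hypothetical names for illustration.

```python
def project(database, prefix):
    """Return the suffixes that follow the first occurrence of prefix."""
    projected = []
    for sequence in database:
        pos = 0
        for item in prefix:
            try:
                # Find the next prefix item at or after the current position.
                pos = sequence.index(item, pos) + 1
            except ValueError:
                break  # this sequence does not contain the prefix
        else:
            projected.append(sequence[pos:])
    return projected


# Toy database (assumed data).
db = [["a", "b", "c", "d"], ["a", "c"], ["b", "a", "d"], ["c", "b"]]

print(project(db, ["a"]))       # [['b', 'c', 'd'], ['c'], ['d']]
print(project(db, ["a", "c"]))  # [['d'], []]
```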

Q.12 Which of the following is not a typical application of sequential pattern mining?

Web clickstream analysis
DNA sequence analysis
Market basket analysis
Image edge detection
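Explanation - Sequential pattern mining focuses on ordered event data, not image processing tasks like edge detection. Market basket analysis counts as an application when each customer's baskets are ordered over time.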
Correct answer is: Image edge detection

Q.13 How does PrefixSpan reduce the candidate sequence problem?

By scanning the database multiple times
By generating candidates only from frequent prefixes
By ignoring infrequent single items
By using clustering before mining
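Explanation - PrefixSpan does not generate and test candidates the way GSP does. It extends each frequent prefix only with items that are frequent in that prefix's projected database, which avoids the combinatorial explosion of candidate sequences.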
Correct answer is: By generating candidates only from frequent prefixes

Q.14 What does 'min_sup' represent in sequential pattern mining?

Minimum length of sequences
Minimum support threshold for frequent sequences
Maximum number of sequences allowed
Minimum number of items in a sequence
Explanation - min_sup defines the minimum frequency a sequence must have to be considered significant or frequent.
Correct answer is: Minimum support threshold for frequent sequences

Q.15 What is the key idea of SPAM (Sequential PAttern Mining using a bitmap representation)?

Using vertical bitmap representation for sequences
Applying clustering to sequences
Generating all possible sequences first
Reducing the number of items in sequences
Explanation - SPAM uses bitmaps to represent sequences efficiently and computes support using bitwise operations.
Correct answer is: Using vertical bitmap representation for sequences
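
The bitwise support computation can be illustrated with plain Python integers as bitmaps. This is a loose sketch of the idea only, assuming sequences of single items; the real algorithm also handles itemset extensions and uses a more compact bitmap layout. The function names are made up for the example.

```python
def item_bitmaps(database):
    """Map item -> list of ints, one per sequence; bit i is set if the item is at position i."""
    items = {item for seq in database for item in seq}
    return {
        item: [sum(1 << i for i, x in enumerate(seq) if x == item) for seq in database]
        for item in items
    }


def sequence_extend(prefix_bits, item_bits, lengths):
    """Bitmaps for 'prefix followed later by item', one int per sequence."""
    out = []
    for p, b, n in zip(prefix_bits, item_bits, lengths):
        if p == 0:
            out.append(0)
            continue
        first = p & -p  # lowest set bit = earliest position where the prefix ends
        after = ((1 << n) - 1) & ~((first << 1) - 1)  # all positions strictly after it
        out.append(after & b)
    return out


def support_count(bits):
    """Number of sequences whose bitmap has at least one bit set."""
    return sum(1 for b in bits if b)


# Toy database (assumed data).
db = [["a", "b", "c"], ["a", "c"], ["b", "a", "c"], ["c", "a"]]
maps = item_bitmaps(db)
lengths = [len(s) for s in db]

a_then_c = sequence_extend(maps["a"], maps["c"], lengths)
print(support_count(a_then_c))  # 3
```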

Q.16 Which of the following best describes a 'contiguous sequence' in sequential mining?

Items appear consecutively without gaps
Items appear at least once in any order
Items are grouped by frequency
Items appear randomly
Explanation - A contiguous sequence requires items to appear one after another in the data sequence, with no other items in between.
Correct answer is: Items appear consecutively without gaps
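
A short check for contiguous occurrence, assuming sequences of single items stored as lists (`contains_contiguous` is a name chosen for this example):

```python
def contains_contiguous(sequence, pattern):
    """True if pattern occurs in sequence as an unbroken run."""
    n, m = len(sequence), len(pattern)
    return any(sequence[i:i + m] == pattern for i in range(n - m + 1))


clicks = ["home", "search", "product", "cart"]  # assumed clickstream

print(contains_contiguous(clicks, ["search", "product"]))  # True
print(contains_contiguous(clicks, ["search", "cart"]))     # False: 'product' lies in between
```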

Q.17 In sequential pattern mining, a 'maximal sequence' is:

A sequence contained within another sequence
A frequent sequence that is not a subsequence of any other frequent sequence
A sequence with minimum support
The shortest frequent sequence
Explanation - A maximal sequence is frequent and has no frequent supersequence. The set of maximal sequences is a concise summary because every other frequent sequence is a subsequence of one of them.
Correct answer is: A frequent sequence that is not a subsequence of any other frequent sequence
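
Filtering a set of frequent patterns down to the maximal ones might look like the sketch below. It assumes patterns are tuples of single items and that the list of frequent patterns has already been mined; the helper names are hypothetical.

```python
def is_subsequence(small, big):
    """True if small can be obtained from big by deleting items, keeping order."""
    remaining = iter(big)
    return all(item in remaining for item in small)


def maximal(patterns):
    """Keep patterns that are not a proper subsequence of another pattern."""
    return [
        p for p in patterns
        if not any(p != q and is_subsequence(p, q) for q in patterns)
    ]


# Assumed result of an earlier mining step.
frequent = [("a",), ("b",), ("c",), ("a", "c"), ("b", "c")]

print(maximal(frequent))  # [('a', 'c'), ('b', 'c')]
```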

Q.18 Why is pruning important in sequential pattern mining?

To reduce dataset size
To eliminate infrequent candidate sequences and save computation
To normalize sequence data
To rank sequences by length
Explanation - Pruning removes sequences that cannot be frequent, reducing unnecessary computation during mining.
Correct answer is: To eliminate infrequent candidate sequences and save computation
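
Pruning typically rests on the Apriori property: if any subsequence of a candidate is infrequent, the candidate cannot be frequent. The sketch below applies that check to candidates of length k using the frequent patterns of length k-1. It assumes sequences of single items and no time constraints; the function name and data are made up for the example.

```python
def has_infrequent_subsequence(candidate, frequent_prev):
    """True if dropping any one item yields a pattern that is not frequent."""
    return any(
        candidate[:i] + candidate[i + 1:] not in frequent_prev
        for i in range(len(candidate))
    )


# Assumed frequent 2-sequences from a previous pass.
frequent_2 = {("a", "b"), ("a", "c"), ("b", "c")}
candidates_3 = [("a", "b", "c"), ("a", "c", "b")]

kept = [c for c in candidates_3 if not has_infrequent_subsequence(c, frequent_2)]
print(kept)  # [('a', 'b', 'c')]
```

The second candidate is discarded without scanning the database because its subsequence (c, b) is not frequent.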

Q.19 Which sequential pattern mining approach avoids candidate generation entirely?

GSP
PrefixSpan
Apriori
Breadth-first search
Explanation - PrefixSpan uses a pattern-growth method that avoids generating all candidate sequences explicitly, unlike GSP.
Correct answer is: PrefixSpan

Q.20 How does sequential pattern mining differ from association rule mining?

It focuses on unordered itemsets
It focuses on ordered sequences of items
It does not consider frequency
It only works on numeric data
Explanation - Sequential pattern mining emphasizes the temporal order of items, while association rule mining ignores order.
Correct answer is: It focuses on ordered sequences of items

Q.21 In a customer transaction dataset, which of the following is a sequential pattern?

{Bread} → {Milk} → {Eggs}
{Bread, Milk, Eggs}
Bread, Milk, Eggs randomly
Bread or Milk or Eggs
Explanation - The arrow indicates the order in which items are purchased, forming a sequential pattern.
Correct answer is: {Bread} → {Milk} → {Eggs}

Q.22 Which technique helps in reducing the size of mined sequential patterns?

Using maximal or closed sequences
Increasing min_sup arbitrarily
Ignoring rare items
Clustering sequences
Explanation - Maximal and closed sequences provide concise representations, reducing redundancy in results.
Correct answer is: Using maximal or closed sequences
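
As a rough illustration of the closed-sequence idea, the sketch below keeps only those patterns that have no proper supersequence with the same support. It assumes patterns are tuples of single items with known support counts; the names and numbers are made up for the example.

```python
def is_subsequence(small, big):
    """True if small can be obtained from big by deleting items, keeping order."""
    remaining = iter(big)
    return all(item in remaining for item in small)


def closed_patterns(supports):
    """Keep patterns with no proper supersequence of equal support."""
    return {
        p: s
        for p, s in supports.items()
        if not any(
            p != q and s == t and is_subsequence(p, q)
            for q, t in supports.items()
        )
    }


# Assumed output of an earlier mining step: pattern -> support count.
supports = {("a",): 3, ("a", "c"): 3, ("b",): 3, ("b", "c"): 3, ("c",): 4}

print(closed_patterns(supports))
# {('a', 'c'): 3, ('b', 'c'): 3, ('c',): 4}
```

Here (c) survives because its support of 4 is higher than that of any supersequence, so dropping it would lose information. A maximal-only summary would drop it as well.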

Q.23 In sequence mining, what is a 'subsequence'?

A sequence that appears randomly
A sequence formed by removing some items from another sequence without changing order
A sequence containing all items in a different order
A sequence of only one item
Explanation - Subsequences preserve the original order but may skip some items from the parent sequence.
Correct answer is: A sequence formed by removing some items from another sequence without changing order
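
A compact way to test this in Python, assuming sequences of single items (the function name is chosen for the example):

```python
def is_subsequence(candidate, sequence):
    """True if candidate can be formed by deleting items from sequence, keeping order."""
    remaining = iter(sequence)
    # `item in remaining` advances the iterator, so each match must come after the previous one.
    return all(item in remaining for item in candidate)


print(is_subsequence(["a", "c"], ["a", "b", "c"]))  # True: 'b' is skipped
print(is_subsequence(["c", "a"], ["a", "b", "c"]))  # False: order is reversed
```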

Q.24 Which of the following is a limitation of traditional sequential pattern mining algorithms?

Cannot handle large datasets efficiently
Only works on numeric data
Ignores item frequency
Cannot find patterns in order
Explanation - Traditional algorithms like GSP may become computationally expensive on large datasets due to candidate generation and multiple scans.
Correct answer is: Cannot handle large datasets efficiently