I'm currently taking a Coursera course on Pattern Discovery as part of the Data Mining Specialization. What follows is a series of blog posts that summarize what I've learned from the course.
Let us start with some terminology:
patterns: groups of items, subsequences, or substructures that frequently appear together in the dataset, i.e. that are strongly correlated.
item: a single datapoint in the dataset. In the classic examples, this is a grocery product. It could also be a physical metric like temperature, an event like opening a web link, or many other things.
itemset: a set of one or more items. Intrinsically unordered, but many algorithms keep the items within an itemset in a fixed order, often by support (see below).
k-itemset: a set of k items, e.g. 10-itemset, 3-itemset, 1-itemset (yes that's a thing).
transaction: a group of items (an itemset) that occurred together. In the classic examples, this is a basket of groceries purchased together.
support: a property of an itemset: how often transactions in the dataset contain that itemset. May be expressed as an integer count or as a fraction (see below).
absolute support: the absolute count of transactions containing the itemset.
relative support: the fraction of transactions in the dataset containing the itemset. Often expressed as a percentage.
minsup: minimum support. Itemsets with support below this threshold are considered uninteresting or meaningless.
frequent: an itemset is said to be frequent when it meets the minsup threshold.
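
To make the support-related terms above concrete, here is a minimal Python sketch. The toy grocery dataset and the function names (`absolute_support`, `relative_support`, `frequent_k_itemsets`) are my own illustrations, not something from the course, and the brute-force enumeration of candidates is only meant to show the definitions at work, not how a real mining algorithm would do it.

```python
from itertools import combinations

# Hypothetical toy dataset: each transaction is one basket of groceries.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]

def absolute_support(itemset, transactions):
    """Count the transactions that contain every item of the itemset."""
    itemset = frozenset(itemset)
    return sum(1 for t in transactions if itemset <= t)

def relative_support(itemset, transactions):
    """Fraction of transactions that contain the itemset."""
    return absolute_support(itemset, transactions) / len(transactions)

def frequent_k_itemsets(transactions, k, minsup):
    """Brute force: test every k-itemset against an absolute minsup."""
    items = sorted(set().union(*transactions))
    result = {}
    for candidate in combinations(items, k):
        count = absolute_support(candidate, transactions)
        if count >= minsup:  # "frequent" means support meets minsup
            result[candidate] = count
    return result

print(absolute_support({"diapers", "beer"}, transactions))   # 3
print(relative_support({"diapers", "beer"}, transactions))   # 0.6
print(frequent_k_itemsets(transactions, k=2, minsup=3))
```

With this data, the 2-itemset {diapers, beer} appears in 3 of the 5 transactions, so its absolute support is 3 and its relative support is 0.6 (60%). With minsup set to an absolute count of 3, it is frequent.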