Final – Sample Problems

Problem 1. Assume that {e,c} are stable attributes and {a,d} are flexible.

Follow the action forest algorithm to find action rules reclassifying objects in Table 2 with respect to d.

Find their confidence and support.

X / e / a / c / d
x1 / 1 / 3 / 1 / 2
x2 / 2 / 1 / 1 / 1
x3 / 1 / 3 / 2 / 2
x4 / 1 / 1 / 1 / 1
x5 / 2 / 3 / 2 / 1
x6 / 1 / 1 / 2 / 1

Table 2.

Problem 2. Find representative rules RR(3,75%) for the set of transactions: (A,B,C,D,F), (A,C,D,E,F), (A,B,C,E,H,I), (B,C,D,E,F), (A,C,D,E,F,H).
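Support and confidence for these transactions can be checked with a short sketch such as the one below. It is only an illustration, not a solution: it assumes that in RR(3,75%) the value 3 is a minimum support given as an absolute transaction count and 75% is the minimum confidence. The helper names `support` and `confidence` are hypothetical.

```python
# The five transactions from Problem 2, each stored as a set of items.
transactions = [
    set("ABCDF"),
    set("ACDEF"),
    set("ABCEHI"),
    set("BCDEF"),
    set("ACDEFH"),
]

def support(itemset):
    """Number of transactions that contain every item of `itemset`."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= t)

def confidence(antecedent, consequent):
    """Confidence of the rule antecedent -> consequent."""
    both = set(antecedent) | set(consequent)
    return support(both) / support(antecedent)

# Assumed reading of RR(3,75%): support count >= 3, confidence >= 0.75.
MIN_SUPPORT = 3
MIN_CONFIDENCE = 0.75

def qualifies(antecedent, consequent):
    """True if the rule meets both assumed thresholds."""
    both = set(antecedent) | set(consequent)
    return (support(both) >= MIN_SUPPORT
            and confidence(antecedent, consequent) >= MIN_CONFIDENCE)
```

For example, `support("AC")` counts the transactions containing both A and C, and `qualifies("A", "D")` tests the rule A -> D against the assumed thresholds. Selecting the representative rules among the qualifying ones is left to the exercise.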

Problem 3.

Follow the agglomerative strategy to cluster objects {y1,y2,…,y6} represented by the information system below.

Y / M / N
y1 / 1 / 2
y2 / 2 / 4
y3 / 6 / 2
y4 / 2 / 2
y5 / 4 / 1
y6 / 1 / 4

Use the Manhattan distance d(yi, yj) = |Mi – Mj| + |Ni – Nj| for objects yi, yj, and the distance d(R,Q) = 1/2 d(A,Q) + 1/2 d(B,Q) - 1/2 d(A,B) between clusters R and Q, where R is formed by merging clusters A and B.
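The two distance formulas above can be sketched in code as follows. This is a minimal illustration under stated assumptions, not a full solution: the function names `manhattan` and `merged_distance` and the dictionary `dist` are hypothetical, and the merging loop itself is left to the exercise.

```python
from itertools import combinations

# Objects from the table in Problem 3: name -> (M, N).
objects = {
    "y1": (1, 2), "y2": (2, 4), "y3": (6, 2),
    "y4": (2, 2), "y5": (4, 1), "y6": (1, 4),
}

def manhattan(p, q):
    """d(yi, yj) = |Mi - Mj| + |Ni - Nj|."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def merged_distance(d_aq, d_bq, d_ab):
    """d(R, Q) = 1/2 d(A,Q) + 1/2 d(B,Q) - 1/2 d(A,B), where R = A merged with B."""
    return 0.5 * d_aq + 0.5 * d_bq - 0.5 * d_ab

# Initial distance matrix over all pairs of single-object clusters,
# keyed by an unordered pair of object names.
dist = {
    frozenset((u, v)): manhattan(objects[u], objects[v])
    for u, v in combinations(objects, 2)
}
```

At each agglomerative step the pair of clusters with the smallest entry in `dist` is merged, and `merged_distance` gives the distance from the new cluster to each remaining one.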

Problem 4.

Using the same data as in Problem 5 follow:

(1) Single Link Technique (maximal connected components in a graph) to find the clusters. Show the resulting dendrogram.

(2) Complete Link Technique (looks for cliques) to find the clusters. Show the resulting dendrogram.

(3) Minimum Spanning Tree (starts with complete graph, removes largest inconsistent edge) to find the clusters.

Problem 5.

Discretize both attributes a and b in the Decision Table T(d). {a,b} are classification attributes.

X / a / b / d
x1 / 1 / 3 / 1
x2 / 10 / 8 / 2
x3 / 5 / 3 / 2
x4 / 3 / 8 / 2
x5 / 10 / 5 / 1
x6 / 5 / 8 / 2
x7 / 3 / 5 / 1

Decision Table T(d).

Problem 6.

Extract classification rules describing D in terms of A, B, C from Table 2 (see below) by following the tolerance relation approach.

X / A / B / C / D
x1 / a2 / b2 / c1 / d1
x2 / a2, a3 / b1 / c2,c3 / d2
x3 / a1 / / c1, c3 / d2
x4 / a1, a2 / b3 / c2 / d1
x5 / a1, a2 / b1 / c2 / d1
x6 / a2 / b2 / c3 / d2
x7 / / b1, b3 / c1, c2 / d2
x8 / a3 / b2 / c1 / d1

Table 2

Problem 7.

The sequence of itemsets purchased by customers C1, C2, C3 is given below.

Customer / Time / Itemset
C1 / 10 / ABD
C1 / 20 / BC
C2 / 20 / CD
C3 / 15 / AD
C3 / 20 / CD

Find all frequent itemsets and some association rules.

Problem 8.

Consider the following table describing objects {x1,x2,…,x7} by attributes {a,b,c}.

X / a / b / c
x1 / 10 / 4 / 10
x2 / 12 / 10 / 5
x3 / 4 / 2 / 2
x4 / 8 / 4 / 2
x5 / 4 / 8 / 8
x6 / 10 / 2 / 5
x7 / 8 / 8 / 10

Construct a binary TV-tree with threshold 2 for an attribute to be active. Show how to use the tree to find the closest document to the one represented by [9, 3, 2].