Hands-On Lab
Building Your First Data Mining Model with SQL Server 2008 R2 Analysis Services
Lab version: 1.0.0
Last updated: 5/3/2011
Contents
Overview
Exercise 1: Embedding Data Mining Results Into a Custom Application
Task 1 – Browsing the Adventure Works Online Shopping Application
Task 2 – Opening the AdventureWorksBI Solution
Task 3 – Creating the Basket Analysis Data Source View
Task 4 – Configuring the Basket Analysis Data Source View
Task 5 – Creating the Basket Analysis Mining Model
Task 6 – Configuring the Basket Analysis Mining Model Algorithm Parameters
Task 7 – Processing the Basket Analysis Mining Model
Task 8 – Viewing the Basket Analysis Mining Model Content
Task 9 – Querying the Basket Analysis Mining Model
Task 10 – Enhancing the Adventure Works Online Shopping Application
Task 11 – Browsing the Enhanced Adventure Works Online Shopping Application
Task 12 – Finishing Up
Summary
Overview
In this lab, you will create a data mining model that uses the Microsoft Association Rules algorithm to identify patterns among product models that are commonly purchased together. The data mining model will be used to provide relevant purchasing suggestions to online customers.
Note: Before you start this lab, ensure that your machine meets the requirements listed in the System Requirements section, and complete the steps described in the Setup section.
Objectives
The objectives of this exercise are to:
· Create a data source view
· Create a Microsoft Association Rules data mining model
· View the mining model content
· Query the mining model
· Embed the mining model query results into a Web application
System Requirements
You must have installed the following items to complete this lab:
· Microsoft SQL Server 2008 R2:
◦ Database Engine
◦ Analysis Services
◦ SQL Server Business Intelligence Development Studio
· SQL Server AdventureWorks2008 R2 sample databases
◦ AdventureWorks2008R2
◦ AdventureWorksDW2008R2
· Microsoft Visual Studio 2010 SP1
◦ Visual Basic
◦ Visual Web Developer
Setup
All the prerequisites for this lab are verified using the Configuration Wizard. To make sure that everything is correctly configured, follow these steps.
Note: To perform the setup steps, you need to run the scripts in a command window with administrator privileges.
1. Launch the Configuration Wizard for this lab by double-clicking the Dependencies.dep file located under the Source\Setup folder of this lab. Install any prerequisites that are missing (rescanning if necessary) and complete the wizard.
Cleanup
There is no need to clean up if you intend to continue the sequence of labs in this training kit.
1. To restore the original state of the AdventureWorks2008R2 and AdventureWorksDW2008R2 SQL Server databases and remove the Sales Analysis Analysis Services database, execute the Cleanup.cmd script located under the Setup folder in the Source folder of this lab.
Exercises
This Hands-On Lab comprises the following exercise:
1. Embedding Data Mining Results Into a Custom Application
Estimated time to complete this lab: 30 minutes.
Exercise 1: Embedding Data Mining Results Into a Custom Application
In this exercise, you will develop a data mining model that uses the Microsoft Association Rules algorithm to identify rules about models commonly purchased together. This type of data mining is called market basket analysis. The patterns discovered by the data mining model will be used by the Adventure Works Online Shopping Web application to cross-promote models by suggesting relevant models during the shopping cart checkout.
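As a rough illustration of what market basket analysis measures, the following Python sketch counts how often pairs of items appear in the same order and derives a simple confidence value for each "if A, then B" rule. The order data and the pair_rules function are hypothetical and are not part of the lab files; the Microsoft Association Rules algorithm is more sophisticated (for example, it also handles itemsets larger than two items), so treat this only as a sketch of the idea.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy orders; not taken from the AdventureWorks databases.
orders = [
    {"Mountain-200", "Mountain Bottle Cage", "Water Bottle"},
    {"Mountain-200", "Mountain Bottle Cage"},
    {"Road-750", "Water Bottle"},
    {"Mountain Bottle Cage", "Water Bottle"},
]


def pair_rules(orders):
    """Return {(lhs, rhs): (support, confidence)} for every ordered item pair.

    support    = number of orders that contain both items
    confidence = support / number of orders that contain lhs
    """
    item_counts = Counter()
    pair_counts = Counter()
    for basket in orders:
        item_counts.update(basket)
        # Sort so that each unordered pair is always counted under one key.
        pair_counts.update(combinations(sorted(basket), 2))

    rules = {}
    for (a, b), n in pair_counts.items():
        rules[(a, b)] = (n, n / item_counts[a])
        rules[(b, a)] = (n, n / item_counts[b])
    return rules


rules = pair_rules(orders)
for (lhs, rhs), (support, confidence) in sorted(rules.items()):
    print(f"{lhs} -> {rhs}: support={support}, confidence={confidence:.2f}")
```

In this toy data, every order that contains a Mountain-200 also contains a Mountain Bottle Cage, so that rule has the highest possible confidence, while the reverse rule is weaker because the bottle cage is also bought without the bike.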
Task 1 – Browsing the Adventure Works Online Shopping Application
In this task, you will explore the Adventure Works Online Shopping Web application to understand how it presently delivers suggestions during checkout.
1. Open Visual Studio 2010 from Start | All Programs | Microsoft Visual Studio 2010 | Microsoft Visual Studio 2010.
2. If prompted to choose default environment settings (required the first time Visual Studio is launched), select Visual Basic Development Settings, and then click Start Visual Studio.
3. To open the AWOnlineShopping solution, on the File menu, select Open | Project/Solution.
4. In the Open Project window, navigate to the Ex1-EmbeddingDataMining\Begin\AWOnlineShopping folder located in the Source folder for this lab, select the AWOnlineShopping.sln file, and then click Open.
5. On the Debug menu, select Start Without Debugging.
6. When the Internet Explorer window opens, if required, maximize the window.
7. On the menu (located on the left), select Catalog by Category.
Figure 1
Selecting the menu item
8. On the Catalog by Category page, in the Product list, click the Mountain-200 Black, 38 link.
9. On the Product Details page, click Add to Shopping Cart.
10. On the Shopping Cart page, notice the three suggestions at the bottom of the page.
11. Click the Display Database Command label, and then review the database command.
Figure 2
Reviewing the database command
Note: These suggestions were retrieved by a relational database stored procedure. They represent a static collection of suggestions, and as such they do not take into consideration items already added to the shopping cart. Clearly, the suggestion to purchase a Mountain-200 is no longer relevant.
12. Close the Internet Explorer window.
13. Leave Visual Studio open.
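The limitation observed in this task (static suggestions that ignore the cart contents) can be contrasted with a cart-aware approach. The following Python sketch is a hypothetical illustration only: the suggest function, the rule tuples, and the probabilities are assumptions and do not reflect the application's actual code. It keeps only rules triggered by an item in the cart and never suggests an item that is already there.

```python
def suggest(cart, rules, limit=3):
    """Return up to `limit` suggested models for the given cart.

    cart  : set of model names already in the shopping cart
    rules : list of (lhs, rhs, probability) tuples, meaning
            "customers who buy lhs also buy rhs with this probability"
    """
    scores = {}
    for lhs, rhs, probability in rules:
        # Use a rule only if it is triggered by the cart and
        # its suggestion is not already in the cart.
        if lhs in cart and rhs not in cart:
            scores[rhs] = max(scores.get(rhs, 0.0), probability)
    # Highest probability first; ties broken alphabetically.
    ranked = sorted(scores, key=lambda model: (-scores[model], model))
    return ranked[:limit]


# Hypothetical rules and probabilities, for illustration only.
example_rules = [
    ("Mountain-200", "Mountain Bottle Cage", 0.6),
    ("Mountain-200", "Water Bottle", 0.5),
    ("Water Bottle", "Mountain-200", 0.4),
    ("Road-750", "Road Bottle Cage", 0.7),
]

print(suggest({"Mountain-200"}, example_rules))
print(suggest({"Mountain-200", "Water Bottle"}, example_rules))
```

With this approach, a Mountain-200 that is already in the cart is never suggested again, which is the behavior the static stored procedure cannot provide.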
Task 2 – Opening the AdventureWorksBI Solution
In this task, you will open an existing solution that consists of the completed labs in this training course. You will then configure the deployment properties for the Sales Analysis Analysis Services project. In this exercise, you will be extending this project to include a new data source view and data mining structure.
1. Open SQL Server Business Intelligence Development Studio from Start | All Programs | Microsoft SQL Server 2008 R2 | SQL Server Business Intelligence Development Studio.
2. To open the AdventureWorksBI solution, on the File menu, select Open | Project/Solution.
3. In the Open Project window, navigate to the Ex1-EmbeddingDataMining\Begin\AdventureWorksBI folder located in the Source folder for this lab, select the AdventureWorksBI.sln file, and then click Open.
Note: This solution consists of all completed labs that precede this lab in the training course.
4. In Solution Explorer, if necessary, collapse the Populate DW and Sales Reports projects.
5. In Solution Explorer, right-click the Sales Analysis project, and then select Properties.
6. In the Sales Analysis Property Pages window, select the Deployment page, set the Server property to <servername>, and then click OK.
Note: Replace <servername> with the name of the machine that hosts Analysis Services.
7. To save the solution, on the File menu, select Save All.
Task 3 – Creating the Basket Analysis Data Source View
In this task, you will create the Basket Analysis data source view. The data source view will be the foundation upon which the data mining model in this exercise will be developed.
1. In Solution Explorer, expand the Sales Analysis project, right-click the Data Source Views folder, and then select New Data Source View.
2. In the Data Source View Wizard, read the welcome message, and then click Next.
3. In the Select a Data Source step, notice that the Adventure Works DW2008R2 data source is selected, and then click Next.
4. In the Select Tables and Views step, in the Available Objects list, scroll to the bottom of the list.
5. While pressing the Control key, select the v2008Order and v2008OrderLine views.
6. Click the arrow to add the selected views to the Included Objects list.
Figure 3
Adding the views to the Included Objects List
7. Click Next.
8. In the Completing the Wizard step, in the Name box, replace the text with Basket Analysis, and then click Finish.
9. When the wizard completes, in Solution Explorer, notice the addition of the Basket Analysis data source view, and that the data source view designer opens automatically.
10. To save the solution, on the File menu, select Save All.
Task 4 – Configuring the Basket Analysis Data Source View
In this task, you will refine the design of the data source view. This will involve providing friendly names for each of the data source view tables, defining a logical primary key and establishing a relationship between the tables.
1. To rename the tables, in the data source view designer, in the Tables pane (located in the bottom left corner), select the v2008Order table, and then in the Properties window, modify the FriendlyName property to Order.
Note: If the Properties window is not visible, on the View menu, select Properties Window.
2. Repeat the last step for the v2008OrderLine table, and modify the FriendlyName property to Basket.
Note: The purpose of this step is to create a user-friendly data model. It is important to configure friendly names at the data source view level so that they are consistently inherited by the objects (cubes, dimensions, and mining models) built on this view.
3. To define the primary key in the Order table, in the Order table, right-click the OrderNumber column, and then select Set Logical Primary Key.
4. To establish a relationship between the Basket table and the Order table, in the Basket table, drag the OrderNumber column on top of the OrderNumber column in the Order table.
Figure 4
Establishing the relationship between the tables
5. To arrange the tables, right-click in a blank area of the diagram, and then select Arrange Tables.
6. To explore the data in the Basket table, in the Tables pane (or the diagram), right-click the Basket table, and then select Explore Data.
7. In the explorer window, notice that many orders include more than one product model.
Note: The data mining model that you will develop in this exercise will discover rules that describe the relationships between product models purchased together (in the same order).
8. To close the explorer window, on the File menu, select Close.
9. To close the data source view designer, on the File menu, select Close.
10. On the File menu, click Save All.
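The relationship defined in this task makes each order a case and the related Basket rows its nested items. The following Python sketch shows the same grouping idea outside of Analysis Services. The rows are hypothetical, and the Model column name is an assumption (only OrderNumber is named in the steps above); the build_baskets function is illustrative and not part of the lab files.

```python
from collections import defaultdict

# Hypothetical (OrderNumber, Model) rows standing in for the Basket table.
# "Model" is an assumed column name used for illustration.
order_lines = [
    ("SO0001", "Mountain-200"),
    ("SO0001", "Water Bottle"),
    ("SO0002", "Road-750"),
    ("SO0001", "Mountain-200"),  # repeated line for the same model
]


def build_baskets(lines):
    """Group order lines by OrderNumber into one set of models per order."""
    baskets = defaultdict(set)
    for order_number, model in lines:
        # A set records each model at most once per order.
        baskets[order_number].add(model)
    return dict(baskets)


baskets = build_baskets(order_lines)
for order_number, models in sorted(baskets.items()):
    print(order_number, sorted(models))
```

Joining on OrderNumber is what lets the algorithm treat all lines of one order as a single basket rather than as unrelated rows.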
Task 5 – Creating the Basket Analysis Mining Model
In this task, you will use the Data Mining Wizard to create the BasketAnalysis_AR mining model.
1. In Solution Explorer, in the Sales Analysis project, right-click the Mining Structures folder, and then select New Mining Structure.
2. In the Data Mining Wizard, read the welcome message, and then click Next.
3. In the Select the Definition Method step, notice the default selection, and then click Next.
4. In the Create the Data Mining Structure step, in the dropdown list, select the Microsoft Association Rules data mining algorithm, and then click Next.
5. In the Select Data Source View step, in the Available Data Source Views list, select the Basket Analysis data source view, and then click Next.
6. In the Specify Table Types step, specify the table types as shown, and then click Next.
Figure 5
Specifying the table types
7. In the Specify the Training Data step, specify the columns to use in the mining model as shown, and then click Next.
Figure 6
Specifying the training data
8. In the Specify Columns' Content and Data Type step, click Next.
9. In the Create Testing Set step, reduce the Percentage of Data for Testing value to 0, and then click Next.
Note: It is very important that you follow the lab instructions precisely, particularly when naming objects. This lab includes code that expects the objects to be named exactly as specified.
10. In the Completing the Wizard step, in the Mining Structure Name box, replace the text with BasketAnalysis, and in the Mining Model Name box, replace the text with BasketAnalysis_AR.
11. Click Finish.
12. When the wizard completes, in Solution Explorer, notice the addition of the BasketAnalysis mining structure, and that the mining structure designer opens automatically.
13. On the File menu, click Save All.
Task 6 – Configuring the Basket Analysis Mining Model Algorithm Parameters
In this task, you will configure the Basket Analysis mining model algorithm parameters.
1. In the mining structure designer, select the Mining Models tab.
2. Right-click the BasketAnalysis_AR model, and then select Set Algorithm Parameters.
Figure 7
Opening the Algorithm Parameters window
3. In the Algorithm Parameters window, configure the Value property for the MINIMUM_PROBABILITY and MINIMUM_SUPPORT parameters as shown.
Figure 8
Configuring the algorithm parameters
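Note: These two parameters set the thresholds that the algorithm applies when the mining model is processed. MINIMUM_SUPPORT sets the minimum number of cases that must contain an itemset before the algorithm generates a rule from it, and MINIMUM_PROBABILITY sets the minimum probability that a rule must have to be kept.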
4. Click OK.
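To make the effect of the two thresholds more concrete, the following Python sketch filters a list of candidate rules by a minimum support and a minimum probability. It is a simplified illustration under stated assumptions: the filter_rules function, the candidate rules, and the threshold values are hypothetical (the lab's actual values appear only in Figure 8), and the real algorithm applies the support threshold while it discovers itemsets rather than as a final filter. The sketch treats a support value below 1 as a fraction of the total cases and a larger value as an absolute case count.

```python
def filter_rules(rules, total_cases, minimum_support, minimum_probability):
    """Keep rules that satisfy both thresholds.

    rules : list of (lhs, rhs, case_count, probability) tuples
    minimum_support : values below 1 are read as a fraction of total_cases,
                      larger values as an absolute number of cases
    """
    if minimum_support < 1:
        required_cases = minimum_support * total_cases
    else:
        required_cases = minimum_support
    return [
        rule
        for rule in rules
        if rule[2] >= required_cases and rule[3] >= minimum_probability
    ]


# Hypothetical candidate rules: (lhs, rhs, case_count, probability).
candidates = [
    ("Mountain-200", "Mountain Bottle Cage", 40, 0.80),
    ("Road-750", "Water Bottle", 3, 0.90),   # too few supporting cases
    ("Water Bottle", "Road-750", 40, 0.05),  # probability too low
]

# Illustrative threshold values, not the ones used in the lab.
kept = filter_rules(
    candidates, total_cases=1000, minimum_support=0.01, minimum_probability=0.1
)
print(kept)
```

Lowering either threshold lets more rules through, which tends to produce more suggestions at the cost of including weaker or rarer patterns.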
Task 7 – Processing the Basket Analysis Mining Model
In this task, you will process the Basket Analysis mining model. Once processed, the mining model will contain the patterns and statistics that describe the relationships between frequently purchased models.
1. In Solution Explorer, inside the Sales Analysis project, right-click the BasketAnalysis mining structure, and then select Process.
2. If prompted to build and deploy the project, click Yes.
3. If prompted to overwrite the database, click Yes.
4. In the Process Mining Structure window, click Run.
Note: The deployment process creates and processes the mining structure. At this time, the data is retrieved from the data source, and the Microsoft Association Rules algorithm correlates and identifies frequent relationships across attribute values, which in this case are product models.
5. When processing completes, in the Process Progress window, click Close.
6. In the Process Mining Structure window, click Close.
Task 8 – Viewing the Basket Analysis Mining Model Content
In this task, you will use three mining model viewers to explore and understand the model content.
1. In the mining structure designer, select the Mining Model Viewer tab.