Seminar 3 SAS
Queries
1. Selecting columns and filtering fields
Exercise 1: Create a frequency report showing the number of suppliers the ABC company has in every country except the United States (U.S.). Prepare data for analysis by creating a query on the Suppliers table, table created by importing the Suppliers spreadsheet of the SupplyInfo.xls file. Include in the query only the suppliers from outside U.S. and only the following columns in this order: Country, CompanyName, City.
Follow these steps:
1. Open a new project and import the Suppliers spreadsheet of the SupplyInfo.xls file, according to the instructions in Seminar1 SAS Engl.
2. With the Suppliers spreadsheet selected in the project tree, press Query Builder.
3. Rename the query Non_US_ Suppliers.
4. Specify the name and location of the query results, selecting and giving the resulting file the name NonUS Suppliers
5. Drag the fields Country, CompanyName and City in the Select Data tab.
6. Filter the data to exclude the rows that have the country USA (Filter Data table, New Filter button). Collect the values of data to be compared from the SAS table using the option
7. Select Run to get out of Query Builder and generate the results of the query. Is should return 25 rows.
8. Often, by default, the query will result in a SAS report. The report can not be used further, for processing data included in the query. To get the frequency report, you must change the query result so that it returns a SAS table. Right-click the query Non_US_ Suppliers in the project tree ->Modify Non_US_ Suppliers. Press Options->Options for This Query-> in the Results panel, Results format, select Override the Corresponding default ... and select Data table if it is not selected. Select Run. Replace the previous results of running the query by selecting the Yes button in the dialog window.
9. To get the number of suppliers in each country, select the query, then Tasks ðDescribeð One-Way Frequencies…
10. In the Data tab, assign to the Country variable the role of Analysis variable.
11. In the Statistics tab, select only the option of showing frequencies and percentages.
12. In order to display at the beginning of the report the countries with the most suppliers, select the Results tab and change the options of the Order output data by field to Descending frequencies
13. Select Run.
14. Save the project
2. Creating a calculated column
Exercise 2: Use Query Builder to create a new a column that calculates the projected inventory, which is calculated as the sum of units in stock and those ordered.
Follow these steps:
1. Add to the project created previously the product table.
2. With the products table selected in the project tree, press Query Builder
3. Rename the query Projected Inventory Query
4. Specify the name and location of query results selecting and giving the resulting file the name ProjectedInventory.
5. In the Select Data tab place all table columns
6. Press Computed Columns, New.
7. Select the option that allows you to calculate a new column based on a formula, then Next.
8. The formula for the new column is UnitsInStock + UnitsOnOrder.
Attention! Use column references from the Tables tree of the advanced expression editor when inserting the formula!
9. Rename the column ProjectedInventory.
10. Notice that the new column is added to the Select Data tab.
11. Save the project.
3. Creating a recoded column
Exercise 3: Edit the options of Projected Inventory query to calculate a new column named UnitCostLevel, grouping values into three categories: from $ 0 to $ 14.99, from $ 15 to $ 29.99, and over $ 30.
Follow these steps:
1. In the previously created project select Projected Inventory Query, right click, Modify…
2. Press Computed Columns, New.
3. Select the option that allows you to calculate a recoded column, then Next.
4. Select UnitCost column, then Next.
5. Press Add, Replace a Range tab.
6. Establish three categories for product prices. Set a lower limit option to 0 and Set an upper limit to 14.99. In the With this value field write “0 to 14.99 dollars”. Press OK.
7. Repeat operations from 5. And 6. for the other two value ranges.
8. To recode through missing values all the other possible values, select in Other values, the A Missing Value option.
9. Specify that the new column is character
10. Name the new column UnitCostLevel.
11. Do not assign any display format for the calculated column.
12. Select OK ð Close
13. Select Run and Yes when asked if you want to keep the previous query results.
14. Select in the project tree or in the Process Flow window the ProjectedInventory table. Analyze the data in the table using Tasks ð Describe ð Table Analysis.
15. In the Data tab, assign to the UnitCostLevel and CategoryName columns the role of table variables.
16. In the Tables tab, in the Preview window, place the UnitCostLevel column over the <drag variables here> field and the CategoryName column over the table.
17. Select Run and examine the results. What are the categories with the most products in each of the three price ranges? But the least? What is the most balanced category represented the three classes of prices?
18. Save the project.
4. Joining table
Exercise 4: Use the Query Builder tool to achieve the join of three tables containing information about suppliers, products and orders. Create a new column that calculates the profit for each order.
Follow these steps:
1. Right click on the products table and select Query builder
2. Name the query Profit Query and the resulting table Profit.
3. Include other tables in the query using Add Tables….
4. From the active project add the table obtained from the import of the Suppliers spreadsheet of the SupplyInfo.xls file. The name of this table is SASUSER.SUPPLIERS
5. Also include in the query the ordersFinal table, from the location in which you saved the seminar working data (the Date Seminar SAS folder)
6. SAS Enterprise Guide tries to find the corresponding columns of the three tables on which to perform the join operation. Since there is no common column between the ordersFinal table and the two other included tables the following message is displayed:
7. Select OK.
8. Observe the join performed automatically between the PROFIT and SUPPLIERS tables, due to the existence of a common column, SupplierID with the same name and type in both tables.
9. To perform a manual join between the ordersFinal and products tables, click on the ProdID column in the ordersFinal table, hold down and drag the mouse cursor over the table column in the products table, then release the mouse button. Select Close.
10. Add to the query the following columns in this order: from the SUPPLIERS table: Country, CompanyName and City, from the products table: ProductName, QuantityPerUnit, UnitCost and CategoryName, from the ordersFinal table: Quatity, SellPrice and Revenue
11. Calculate using the advanced expressions editor, a new column called Profit, using the formula: (SellPrice-UnitCost)*Quantity.
12. Set for the Profit column: type Currency, display format DOLLARw.d with width 8 and 2 decimals.
13. Filter data to include only providers from Europe
14. Select Run to execute the query and exist Query Builder.
15. Analyze, using a Pie chart, which supplying European countries brought the highest values of profit (both percentage and absolute value) for the ABC company.
5. Parameter queries
Exercise 5: Starting from the Profit table created in the previous excercise, use a query to create a SAS table containing the total profit generated by each supplier. The query must contain a parameter for the Profit column. Based on this parameter, create a filter so that only suppliers that have generated higher returns than a user-specified value will be included in the query. Then use List Data processing to
Follow these steps:
1. Open the Profit table. In the menu bar above the Data grid select Describeð Summary Statistics Wizard .
2. At Step 1 do not change any options.
3. At Step 2 select as analysis variable Profit and as grouping variables Country and Company Name.
4. At Step 3 check Show Statistics and Save statistics to data set options. Press the Edit button in the top right of the window and select only the options for calculating the average and the amount of profit. Click the Browse button and rename the created report TOTAL_PROFIT.
5. At Step 4 do not change any options.
6. Select the newly created table that includes statistics. Create a query called Profit Limit Query for the table, by pressing Query Builder.
7. Press Prompt Manager to create a parameter, Add.
8. In the General tab, type in the Name field: Profit_Limit and in the Display test field: Display Suppliers with Total Profits Exceeding $.
9. In the Prompt Type and Values tab, select in order the options: Type numeric, User enter values, Single Value, Allow only integer values. Default value: 0. For the other options do not specify anything. Press OK.
10. Set a filter for the Profit_Sum column based on value of Profit_Limit parameter. Select the Filter Data tab and drag the name of the Profit_Sum column from the left to the filter surface.
11. In the New Filter dialog box, select the Greater Than operator. Tick Generate filter for a prompt value. In the Value field select the icon , then the Prompts tab and Profit_Limit variable. Click Next and Finish.
12. Select the Select Data tab and add to the query the fields: Country, CompanyName, Profit_Mean şi Profit_Sum.
13. Change the query options so that they return a report. Press Options, Options for This Query the Results panel, Results format, select Override the Corresponding default... and select Report.
14. Select Run. When the window to prompt for the parameter appears, enter 5000 and select Run. Only suppliers whose products have generated a profit tax higher than $ 5,000 are included in the report
15. Additionally: Change the query which creates the Profit table, to display all suppliers, not only those in Europe. Then, under the process flow (Process Flow), select all the objects that are created from the Profit table and run the selected sequence of the project (Run button or press F8). Analyze the results.
6. Outer joins
Exercise 6: The ABC Company wants to identify orders placed for products that can not longer be ordered from suppliers because they are out of production. When you are out of production, products are removed from the products table. Based on tables ordersFinal and products, perform a query to create a virtual table containing orders placed for products that can no longer be ordered from suppliers.
Follow these steps:
1. Starting from the products table create a query to which you add the ordersFinal table.
2. Change the default join type and select an external join that displays orders placed for products that cannot be ordered from suppliers because they were out of production.
3. Add to the query the following columns ProductID, ProductName, ProdID, Quantity, SellPrice and Revenue.
4. Set options for the query to have as result a virtual table. Press Options, Options for This Query, Results panel, Results format, tick Override the corresponding default… and select Data view.
5. Rename the query Discontinued Products Query.
6. Apply a filter to the data to include only the orders whose products have been removed from the products table.
7. Save the project.