11-4-2009
Exercise 3A: Data Set Up
The goal of this exercise is to give you familiarity with named ranges and dynamic named ranges for your data as a starting point for graphing and analyzing data.
The example we will use is from a site with some very high PM10 concentrations, over about 6 months. All the QC checks, calibrations, audits, etc. were satisfactory, so the PM10 levels should not be discounted as errors. Furthermore, this area is very dry, dusty, and windy, with these conditions lasting all year, so the high levels should also not be necessarily reported as unusual events. Eventually, we will evaluate whether the high PM10 levels are correlated in time with other factors which would lend credence to them, such as high wind speeds (there is already extremely low precipitation in this area). For now, the exercise in part 3A shows how to set up data with named ranges so that subsequent plotting and analysis is faster.
3A.1. Named Ranges
Open the spreadsheet Exercise 3A-Data SetUp.xls, sheet DATA!, and click on Insert > Name > Define:
In the Names in workbook pane, enter TEOM_WS (note that you can copy and paste from this doc into dialog boxes using ctrl-C and ctrl-V).
In the Refers To pane, click on the colored square to minimize the dialog box (this allows you to select the range in the sheet), and select the entire range by clicking on cell A1, holding down shift-ctrl, and pressing the right arrow, then the down arrow to select the entire range of filled cells. Click on the colored square again to re-enter the dialog box, then click OK.
This has given you a named range which you can now refer to more easily than the row-column name in formulas and charts.
To verify that this name exists, and to see the range to which it refers, click the down arrow in the name box, and all names in the workbook (but not dynamic names, which change their range as you add or remove data) will show up in the Name box, and you can click on the name and that range will be selected:
When the name in the box is highlighted, the range will become selected:
Now this range can be referred to in calculations or charts.
3A.2. Using Names in Chart Wizard
To see how this is done, click on the chart wizard icon, line chart,
And when prompted in the next screen for data range, click on Insert, Name, Paste,
And paste in the name of your range:
Create a line chart of the data following these screen shots and put the chart in Sheet 1.
Now, look at your chart which should look something like this—ugh!:
3A.3. Chart: Adding Secondary Axis
Because of the different scales of the PM10 and wind speed, it is impossible to show both data on the same chart unless two axes are used. To create the 2nd axis, pull the table open enough to allow you to hover your cursor over any wind speed point (these wind speed points will be easiest to find in a place where there are fewer PM10 points):
Right click, and select Format Series:
Click on Axis, Secondary Axis:
And Patterns, Markers, None, so that you can better see the data:
Now your chart should look something like this, (and we will leave this chart as it is for now):
(You can verify that there are missing data at the end of July thru the beginning of August, as can be seen by scrolling thru the data.)
(Do not do this now, but if you do not want a line plotted when there is missing data, the easiest thing to do is to insert the missing days, with no values in either the PM10 or wind speed cells. This can be done easily by inserting the missing number of rows and using Autofill to add the dates. Then select the chart, click on Tools, Options, and make sure plot empty cells – not plotted, leave gaps, is selected.)
Save the spreadsheet onto your USB drive with the filename Exercise 3A-SetUp-YOURINITIALS.xls.
3A.4. Dynamic Named Ranges:
As your data change, and you want to create the same summary statistics or charts each month or quarter, but your data increases in rows or columns, how can you include the changing range without changing all your formula and chart references?
Use Dynamic Named Ranges, that contract or expand with the data.
Using the OFFSET Function
Open the sheet called OFFSET. Enter values between 1 and 15 in cell F2 to evaluate which values are returned in cell E2. (The formula is in cell E3 is text.)
OFFSET returns a reference to a range that is a specified number of rows and columns from a cell or range of cells. The reference that is returned can be a single cell or a range of cells. You can specify the number of rows and the number of columns to be returned.
Syntax: OFFSET(reference,rows,cols,height,width) (height and width are optional)
This can be useful if you want to find the value corresponding to some number of hours (rows) ahead, which you enter in cell F2, that is 1 column to the right of the target (reference) and the number of hours (rows) below the target cell. (Start counting with the reference cell as zero.)
This returns the value in cell B6.
If height or width is omitted, it is assumed to be the same height or width (rows and columns) as the reference cell or range.
COUNTA counts the number of rows that you have put in this sheet, and you can insert that into the offset function to create a dynamic range.
Note: don’t count the entire column unless you think you have close to 65,000 rows because this can slow down or freeze your computer. Also, use countA so that you include error values and do not end up rows short of the end of your actual data. (CountA also counts formulas.)
In the DATA sheet, Insert > Name > Define:
In the Names in Workbook you will see the previous name that you defined, and now you will add a range that will change to include new data (rows and columns) that you add to the existing data.
In the name box type AllDates. In the Refers To box, type in
(NOTE: You can cut and paste from this word document by using ctrl-C and ctrl-V. ALSO, do not use commas in numbers in formulas because those are Excel’s argument delimiters)
=OFFSET($A$1,0,0,COUNTA($A$1:$A$10000),1)
Then click Add, then OK.
You can verify that the correct range is returned by clicking in the Name box and entering the name you just entered and excel will select the range:
The entire column of data is selected, with all the dates included. If you add dates in the middle, beginning or end they will be included. (You can also check names by selecting Edit > GoTo.)
To create a dynamic table, stretching across as many columns as you have data for, use the countA function again, counting across columns also. (Note that this method assumes that column A will set the max table height, which will be the case if column A has dates/times. If your table has columns other than column A setting the max table height, call me.)
Insert > Name > Define a new name called AllData:
In the Refers To box, type in (or copy from here):
=OFFSET($A$1,0,0,COUNTA($A$1:$A$10000),COUNTA($1:$1))
Now check this range by typing in AllData in the name box:
This selects the entire range. To test this, add a few entries in column D and see how the range changes (note that you must have a header in row 1 so that all columns are counted):
When you have tasks such as graphing data each month, but the number of rows or columns change, or you keep adding to it, and you have an associated graph that you want to keep consistent as your data change, you can keep a sheet such as this for your target sheet. Save this sheet as Exercise 3A-AllData-YOURINITIALS.xls on the jump drive.
3A.5 Using Filtered Ranges in Charts and Functions (recalculate with hidden cells eliminated)
In order to have charts and functions (such as =slope(Ys, Xs), etc.) calculate based on filtered columns, you need to do something extra. The excel SUBTOTAL function allows you to calculate other things than subtotal, specifying whether or not you want hidden values to be part of the calculation. (This is excel help topic HP10062463.) The syntax is =subtotal(function_num, range) where the function_nums are listed below and the range is what you want to calculate.
Function_num(includes hidden values) / Function_num
(ignores hidden values) / Function
1 / 101 / AVERAGE
2 / 102 / COUNT
3 / 103 / COUNTA
4 / 104 / MAX
5 / 105 / MIN
6 / 106 / PRODUCT
7 / 107 / STDEV
8 / 108 / STDEVP
9 / 109 / SUM
10 / 110 / VAR
11 / 111 / VARP
So in our example, if we have a set of data (see sheet Offset-2) that we want to filter and have a chart and functions that automatically recalculate based on only the VISIBLE values (leaving out those that have been filtered out) then use =subtotal(1,range) for example, which will calculate the average of the visible values in that range.
The following data is on sheet Offset-2 of the Exercise 3.xls file:
The chart was made by creating an x-y (scatterplot) chart, as follows:
Right click the blank chart and Select Data:
Add a series:
Type in a name for the title, and select the date column for the x-axis, and select the Concentration column for the Y values, and since you created a dynamic named range for the WS you can insert that by clicking on Formulas, Name Manager, Paste Into Formula, Paste Names, like this:
And the same for the Y values, making them the filtered concentration column using the dynamic named range:
So the whole series plotted looks like:
Now add a trendline by either right-clicking the SERIES:
Or going to Chart Tools, Layout, Add Trendline. Be sure to check More Trendline Options, and:
This produces a chart like this:
And when you filter out some dates or values the trendline automatically recalculates. However:
Exercise 3A-1
.