Airline Gas Price Analyzer

Airline Gas Price Analyzer

Airline Gas Price Analyzer

Yudith Carmazzi

Sulakshman Madala

Vikas Reddy Vellakonda

University of Houston – Clear Lake

November 11, 2011Page 1 of 6Airline Price Analyzer

Abstract

The airline industry faces many challenges due to the rising cost of crude oil. The use of a data warehouse provides the ability to make decisions in order to maintain competitive advantage.

Introduction

Gas prices have been fluctuating a lot over the past 5 years as a result of the constant increase in the international crude oil prices. The airline industry has been facing with many challenges. Not only has the rising price of crude oil been an issue, but the increase competition among the airlines. Like any other major industry, the survival of the Airline industry depends up on keeping themselves in top with their competitors. In addition the slow economy has forced the airlines to think of alternatives to increase revenue. Many of these alternatives have been introducing luggage fees, decreasing the ticket price to lure more travelers, providing services to location where competitors might not have service, etc. Airline industries use Data warehouses in order to analyze trends and making predictions in order to stay competitive.

A Case Study: Airline Gas Analyzer

Business scenario

VYM is Leader in Developing Customized decision making systems for the Business. The company started in 1992 at Houston, TX with 25 employees. Has aglobal footprintwith 4 offices and development centers in US, India and Singapore. VYM currently has over 200 employees. Our major client base includes governmental agencies, airline industry, consumer watchdog groups and airline customers. Due to the raising crude oil prices, a major airline industry hired our services to gather and analyze data by using a data warehouse and its analysis tools.

Problem statement, data origins, and expected value

Why Data Warehousing?

It is well known fact that gas prices have been raising through time. In order to analyze the data through time and evaluate different business dimensions, a data warehouse was built. Our Data warehouse is designed to answer the following questions:

  • How Oil prices affected the number of travelers
  • Prices for most frequent flight route
  • The most traveled routes
  • Prices for most frequent days traveled
  • Least and Most efficient fuel efficient planes
  • Most traveled airlines
  • Most Traveled Holidays

The Online Processing Analytical Processing (OLAP) tools available will allow us to analyze the data in different angles to derive meaningful information for our clients.

Methodology

Our Airline Gas Analyzer Data warehouse was first approached by gathering historical data on gas prices, airline passenger data, airport locations and miles and airline plane size. Data elements were evaluated and separated into dimension tables and fact table. Initially, a STAR schema was proposed and eventually evolved into a Snowflake schema. Once the dimensional model was finalized, several SQL statements were developed. These SQL statements create tables, primary keys and foreign keys and links among them the tables in the SQL Server. Once the tables were created, data was cleansed, formatted and integrated into these tables. A Visual Studio 2008 – Analysis Service Project was created. Connection to the SQL database was made and a data source view was created. In addition, dimensions and cube were developed.

Dimensional Modeling

Business dimension are fundamental when defining data warehouse. Each type of business has a different business dimensions (Ponniah 103). When gathering information we analyzed data element categories and hierarchies for the business dimensions and the measurements that became part of the fact tables. We developed an informational package diagram shown in table 1.

Dimensions
Travel Time / Flight Routes / Airplane / Carrier
Year / Departure Airport / Passenger Capacity / Carrier Name
Quarter / Arrival Airport / Gallons to Fuel
Month / Airplane Maker
Day Week
Holiday
Day
METRICS: Number of Customers Travelled, Cost of airfare, Cost of Crude oil , Travel miles

Table 1. Informational Package Diagram

Dimensional modeling is a design technique to structure the business dimensions and metrics (Ponniah 226).

Initially,we developed a STAR schema where several dimensional tables are linked to one or many fact tables. Dimensional tables have attributes and hierarchy. Fact tables contain the measurements. The STAR schema is shown in Figure 1.1

Figure 1.1 – STAR Schema

A Snowflake Schema was later developed to normalize the dimensional tables. This schema allows greater efficiency and organization to the data model. Snowflake Schema is shown Figure 1.2

Figure 1.2 - Snowflake Schema

In addition to the modeling, a data dictionary was built. The data dictionary contains the attributes for each table, the attribute’s data type, keys and size. The data dictionary is shown in Figure 1.3

Figure 1.3 – Data Dictionary

Data Staging

After Developing the Star and Snowflake schema, data was staged for data warehousing. Data staging involves data extraction, data transformation and data loading steps.

Data Gathering and Extraction

Data was gathered from the following internet locations:

Airfare, Airline and Route data –Research and Innovative Technology Administration - Bureau of Transportation Statistics (RITA) (

Crude Oil Price Data - U.S. Energy Information Administration (

Flight Schedules – American Airlines

(

Data was downloaded into spreadsheets or PDF files.

Data was downloaded into spreadsheets or PDF files. Figure 1.4 shows a snapshot of the crude oil price data collected.

Figure 1.4 – Crude Oil data snapshot.

Data Transformation

Bad data leads to bad decisions (Ponniah 128). Data Quality should be a priority when performing data transformation. Data was cleansed and prepared for data extraction. Missing data was populated manually. Data was formatted consistently and keys were created.

Data Loading

The data loading step was performed by using SQL queries. These SQL queries created the dimension and fact tables and loaded the data from the spreadsheet into tables at the SQL server domain. An example of a query can be seen in Figure 1.5

Need screenshot

Figure 1.5 SQL command

After the data was loaded and tables created, relationships between the fact and dimension tables were built. These relationships were built according to the Snowflake schema. Figure 1.6 shows the data warehouse data model.

Figure 1.6 - Data Warehouse Data Model.

Information Delivery

Once the SQL server database is built, we must use a tool that will allow us to deliver the information to the user. We build the data warehouse by constructinga cube or multiple cubes and the use of Online Processing Analytical Processing (OLAP) tool to analyze the data. The OLAP tool provides the user to slice, dice or rotate the data, roll-up or drill-down the data.

Cube Implementation and OLAP

The cube is constructed by combining a set of dimensions and several measure groups (Cameron 22). In order to analyze a specific dimension, this dimension must be stored in the Cube. The Cube provides the ability to store data efficiently and being able to retrieve the data faster.

In order to build a cube, we deployed the SQL Server Business Intelligence software tool. Create a New Project and link to the created database located in the SQL Server. Once the data source is available, a Data Source view must be created. A data source view is a logical data model that exists between your physical source database and Analysis Service dimensions and cubes (Cameron 42). Additional dimensions can be created with few attributes of the original dimension table. Then, the newly created dimension can have the attributes organized in hierarchies. In our airline data warehouse, the travel time dimension had hierarchies as shown in figure 1.7.

Figure 1.7 – Travel Time hierarchy

Then, the cube is constructed by selecting the fact table measurements and the dimensions. Figure 1.8 shows the cube implemented for this project.

C Users madalas4632 Desktop New folder Cube Screenshots Cube JPG

Figure 1.8 - Cube data model.

After the cube is constructed, the project is deployed. Once deployment is successful, we can browse the cube. To browse the cube, we can add in the left and top side the dimensions and in the center the measurements like it is shown in Figure 1.9.

Figure 1.9 – Location to add dimensions and measurements for cube browsing.

One of questions we needed to answer was the price of airfares per route. Figure 1.10 shows how routes airfare changed through time. In this case, we wanted to view how airfares changed by C Users madalas4632 Desktop New folder Cube Screenshots Fares per Route JPG
Quarter and Month.

Figure 1.10 – Cube browsing for route airfare cost by quarter and month.

Conclusion and Discussion

Data warehousing provides the user the ability to view and analyze data. Developing a data warehouse is not an easy task. Data warehouse creation involves several steps from data analysis and modeling, data gathering and extraction, data transformation, data loading, cube building, deployment, and browsing. One of the most time consuming portions of building a data warehouse is the extraction, transformation and loading of data. Once the cube is successfully deployed, data browsing and analysis becomes an easiertask by dragging and dropping the dimensions and measurements. With the OLAP tools we were are able to determine who was which route was most expensive or themost travelled route for a time period, which airline was most costly per route for a time period, etc. The ability for users to choose the data to be analyzed without having to write complicated queries is one of the most important features for a data warehouse. In addition, a data warehouse allows making predictions of future data by looking at trends. For the airline industry, a data warehouse becomes extremely useful since the airline will be able to predict fuel cost and make business decisions to offset operational costs. In the competitive business world, data warehousing will become a necessity for business survival.

References

Ponniah, Paulraj. Data Warehousing fundamentals for IT professionals. 2nd ed. Canada: John Wiley & Sons, Inc., 2010

Cameron, Scott and Hitachi Consulting. Microsoft SQL Server 2008 Analysis Services – Step by Step. Microsoft Press, 2009.

November 11, 2011Page 1 of 6Airline Price Analyzer