CSE 5331/4331 Summer 2015

Project 2

In this project, you will learn to use MongoDB as an example of a document-oriented NoSQL system, and see how data is stored and queried in such a system. You will also learn the difference between storing data in a flat (relational) format and storing it in a document (complex object) format.

The input to your program will be several data files in flat relational format for the Soccer World Cup 2014 database. The schema for this database is as follows:

COUNTRY

Country_Name / Capital / Population / No_of_Worldcup_won / Manager

PLAYERS

Player_id / Name / Fname / Lname / DOB / Country / Height / Club / Position / Caps_for_country / Is_captain

MATCH_RESULTS

Match_id / Date / Start_time / Team1 / Team2 / Team1_score / Team2_score / Stadium / Host_city

PLAYER_CARD

Player_id / No_of_Yellow_cards / No_of_Red_cards

PLAYER_ASSISTS_GOALS

Player_id / No_of_Matches / Goals / Assists / Minutes_Played

World_cup_History

Year / Host / Winner

You will need to design two document (complex object) schemas corresponding to this data:

1.  The COUNTRY document will include the following data: Cname, Capital, Population, Manager (of the national soccer team), a list of the players {players: Lname, Fname, Height, DOB, is_Captain, Position, no_Yellow_cards, no_Red_cards, no_Goals, no_Assists}, and a list of the World Cups the country has won {Year, Host}.

2.  The STADIUM document will include the following data: Stadium, city, and a list of the matches played at the stadium {Match: Team1, Team2, Team1Score, Team2Score, Date}.
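As an illustration of the nesting described above, the following Python sketch builds COUNTRY documents from flat rows held in memory. It is only one possible approach and rests on several assumptions: the rows have already been parsed into dictionaries keyed by the column names in the schema, the function name build_country_documents and the field name world_cups_won are hypothetical, players with no row in PLAYER_CARD or PLAYER_ASSISTS_GOALS default to zero, and the sample values are invented placeholders rather than data from the course files.

```python
from collections import defaultdict


def build_country_documents(countries, players, cards, goals, history):
    """Nest flat rows into one COUNTRY document per country (hypothetical helper)."""
    # Index the per-player tables by Player_id for direct lookups.
    cards_by_id = {row["Player_id"]: row for row in cards}
    goals_by_id = {row["Player_id"]: row for row in goals}

    # Group player sub-documents under the country they play for.
    players_by_country = defaultdict(list)
    for p in players:
        card = cards_by_id.get(p["Player_id"], {})
        goal = goals_by_id.get(p["Player_id"], {})
        players_by_country[p["Country"]].append({
            "Lname": p["Lname"],
            "Fname": p["Fname"],
            "Height": p["Height"],
            "DOB": p["DOB"],
            "is_Captain": p["Is_captain"],
            "Position": p["Position"],
            # Assumption: a player missing from a table has zero cards/goals.
            "no_Yellow_cards": card.get("No_of_Yellow_cards", 0),
            "no_Red_cards": card.get("No_of_Red_cards", 0),
            "no_Goals": goal.get("Goals", 0),
            "no_Assists": goal.get("Assists", 0),
        })

    # Group World Cup wins under the winning country.
    wins_by_country = defaultdict(list)
    for h in history:
        wins_by_country[h["Winner"]].append({"Year": h["Year"], "Host": h["Host"]})

    return [
        {
            "Cname": c["Country_Name"],
            "Capital": c["Capital"],
            "Population": c["Population"],
            "Manager": c["Manager"],
            "players": players_by_country.get(c["Country_Name"], []),
            "world_cups_won": wins_by_country.get(c["Country_Name"], []),
        }
        for c in countries
    ]


# Placeholder rows (invented values, not taken from the course data files).
countries = [{"Country_Name": "Country A", "Capital": "City A",
              "Population": 1000, "Manager": "Manager A"}]
players = [{"Player_id": 1, "Fname": "First", "Lname": "Last",
            "DOB": "1990-01-01", "Country": "Country A", "Height": 180,
            "Position": "GK", "Is_captain": True}]
cards = [{"Player_id": 1, "No_of_Yellow_cards": 2, "No_of_Red_cards": 0}]
goals = []  # no PLAYER_ASSISTS_GOALS row for this player
history = [{"Year": 2000, "Host": "Country B", "Winner": "Country A"}]

docs = build_country_documents(countries, players, cards, goals, history)
```

Indexing the per-player tables by Player_id first keeps the join to a single pass over the PLAYERS rows; whether missing rows should default to zero or be omitted is a design decision left to your own schema.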

The GTA will post instructions on the Web site for downloading and installing MongoDB on your computer.

Data files for the Players, Player_Cards, Player_Assists_Goals, Match_results, Country, and Worldcup_History tables will be posted on the Web site in relational (flat file) format.

Your tasks for the project are as follows:

1.  Install MongoDB on your computer.

2.  Write programs to extract the data needed for the two document types above (COUNTRY and STADIUM) from the relational data files, and load these documents into the MongoDB system.

3.  Write some MongoDB queries to retrieve some of the stored documents.
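Tasks 2 and 3 could be approached along the following lines for the STADIUM documents. This is a rough sketch, not a required design. It assumes the MATCH_RESULTS rows are already parsed into dictionaries keyed by the schema's column names, that the PyMongo driver is installed, and that a MongoDB server is listening at the default local address. The function names, the database name worldcup, the collection name stadium, the field name matches, and the team name in the query are all hypothetical.

```python
from collections import defaultdict


def build_stadium_documents(match_rows):
    """Group flat MATCH_RESULTS rows into one STADIUM document per stadium."""
    matches_by_stadium = defaultdict(list)
    for m in match_rows:
        # Assumption: a stadium is identified by its name plus host city.
        key = (m["Stadium"], m["Host_city"])
        matches_by_stadium[key].append({
            "Team1": m["Team1"],
            "Team2": m["Team2"],
            "Team1Score": m["Team1_score"],
            "Team2Score": m["Team2_score"],
            "Date": m["Date"],
        })
    return [
        {"Stadium": stadium, "city": city, "matches": matches}
        for (stadium, city), matches in matches_by_stadium.items()
    ]


def load_and_query(stadium_docs, team, uri="mongodb://localhost:27017"):
    """Insert the documents, then find stadiums where `team` played as Team1."""
    # Imported here so the grouping function above works without PyMongo.
    from pymongo import MongoClient

    client = MongoClient(uri)
    collection = client["worldcup"]["stadium"]
    collection.insert_many(stadium_docs)  # requires a non-empty list
    # Dot notation matches any element of the embedded `matches` array.
    cursor = collection.find(
        {"matches.Team1": team},
        {"_id": 0, "Stadium": 1, "city": 1},
    )
    return list(cursor)


# Placeholder rows (invented values, not taken from the course data files).
rows = [
    {"Match_id": 1, "Date": "2014-06-12", "Start_time": "17:00",
     "Team1": "Team A", "Team2": "Team B", "Team1_score": 1,
     "Team2_score": 0, "Stadium": "Stadium X", "Host_city": "City X"},
    {"Match_id": 2, "Date": "2014-06-13", "Start_time": "13:00",
     "Team1": "Team C", "Team2": "Team D", "Team1_score": 2,
     "Team2_score": 2, "Stadium": "Stadium X", "Host_city": "City X"},
    {"Match_id": 3, "Date": "2014-06-14", "Start_time": "16:00",
     "Team1": "Team A", "Team2": "Team C", "Team1_score": 0,
     "Team2_score": 3, "Stadium": "Stadium Y", "Host_city": "City Y"},
]
stadium_docs = build_stadium_documents(rows)
```

With a server running, load_and_query(stadium_docs, "Team A") would insert the documents and return the matching stadiums. Running it more than once inserts duplicates, so a real loader may want to clear or upsert the collection first.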

Due Dates: You should turn in an intermediate report by Tuesday, July 28 (11:59 pm) that includes your preliminary program design as pseudo-code (a high-level code description) and the data structures that will be used. You must also demonstrate that you have installed MongoDB by creating a simple document (object) and turning in documentation to show this.

The complete project is due at 11:59 pm on Thursday, August 6. A late penalty of 5% per day late will be assessed. You should turn in: (i) Sufficient documentation of the design and implementation of the program, including the updated pseudo-code and a detailed description of each data structure used. (ii) An output file that shows your queries and the query results. (iii) The source code for your program, with sufficient internal documentation (comments) for the GTA to understand and execute your program. A demo will be required.

Note 1: This project can be done in groups of up to 2 students per group. All students in a group will receive the same grade.

Note 2: A number of queries will be provided one week before the program is due. You should run all the provided queries and turn in the result for each query.