CSE 5331/4331 Summer 2015
Project 2
In this project, you will learn to use MongoDB as an example of a document-oriented NoSQL system, and see how data is stored and queried in such a system. You will also learn about the difference between storing data in a flat (relational) format and storing it in a document (complex object) format.
The input to your program will be several data files in flat relational format for the Soccer World Cup 2014 database. The schema diagram for this database is as follows:
COUNTRY
Country_Name / Capital / Population / No_of_Worldcup_won / Manager
PLAYERS
Player_id / Name / Fname / Lname / DOB / Country / Height / Club / Position / Caps_for_country / Is_captain
MATCH_RESULTS
Match_id / Date / Start_time / Team1 / Team2 / Team1_score / Team2_score / Stadium / Host_city
PLAYER_CARD
Player_id / No_of_Yellow_cards / No_of_Red_cards
PLAYER_ASSISTS_GOALS
Player_id / No_of_Matches / Goals / Assists / Minutes_Played
World_cup_History
Year / Host / Winner
You will need to design two document (complex object) schemas corresponding to this data:
1. The COUNTRY document will include the following data: Cname, Capital, Population, Manager (of the national soccer team), and a list of the players {players: Lname, Fname, Height, DOB, is_Captain, Position, no_Yellow_cards, no_Red_cards, no_Goals, no_Assists}, plus a list of the World Cups won {Year, Host}.
2. The STADIUM document will include the following data: Stadium, city, and a list of the matches played at the stadium {Match: Team1, Team2, Team1Score, Team2Score, Date}.
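The nesting described in the two document types above can be sketched in Python as follows. This is only one possible approach, not a required design. It assumes each flat file has already been read into a list of dicts keyed by the column names in the schema (for example with csv.DictReader), that PLAYERS.Country and World_cup_History.Winner hold the same values as COUNTRY.Country_Name, and that a player missing from a statistics table has zero cards, goals, and assists. The list field names "worldcup_won_history" and "matches" are illustrative choices, not names given in the assignment. Values read from text files are strings, so any numeric conversion would have to be added.

```python
from collections import defaultdict


def build_country_docs(countries, players, cards, goals, history):
    """Nest flat rows (dicts keyed by column name) into COUNTRY documents."""
    # Index the per-player tables by Player_id so each lookup is constant time.
    cards_by_id = {row["Player_id"]: row for row in cards}
    goals_by_id = {row["Player_id"]: row for row in goals}

    # Group players under their country, merging in card and goal statistics.
    players_by_country = defaultdict(list)
    for p in players:
        card = cards_by_id.get(p["Player_id"], {})
        goal = goals_by_id.get(p["Player_id"], {})
        players_by_country[p["Country"]].append({
            "Lname": p["Lname"],
            "Fname": p["Fname"],
            "Height": p["Height"],
            "DOB": p["DOB"],
            "is_Captain": p["Is_captain"],
            "Position": p["Position"],
            # Assumption: no row in a statistics table means a count of zero.
            "no_Yellow_cards": card.get("No_of_Yellow_cards", 0),
            "no_Red_cards": card.get("No_of_Red_cards", 0),
            "no_Goals": goal.get("Goals", 0),
            "no_Assists": goal.get("Assists", 0),
        })

    # Group World Cup wins under the winning country.
    wins_by_country = defaultdict(list)
    for h in history:
        wins_by_country[h["Winner"]].append({"Year": h["Year"], "Host": h["Host"]})

    return [
        {
            "Cname": c["Country_Name"],
            "Capital": c["Capital"],
            "Population": c["Population"],
            "Manager": c["Manager"],
            "players": players_by_country.get(c["Country_Name"], []),
            "worldcup_won_history": wins_by_country.get(c["Country_Name"], []),
        }
        for c in countries
    ]


def build_stadium_docs(matches):
    """Group MATCH_RESULTS rows into one STADIUM document per stadium."""
    stadiums = {}
    for m in matches:
        # Create the stadium document the first time its name is seen.
        doc = stadiums.setdefault(
            m["Stadium"],
            {"Stadium": m["Stadium"], "city": m["Host_city"], "matches": []},
        )
        doc["matches"].append({
            "Team1": m["Team1"],
            "Team2": m["Team2"],
            "Team1Score": m["Team1_score"],
            "Team2Score": m["Team2_score"],
            "Date": m["Date"],
        })
    return list(stadiums.values())
```

Both functions return plain lists of dicts, which is the shape most MongoDB drivers accept for insertion. Keeping the joins in memory is reasonable here because the World Cup data set is small; a larger data set might call for a different strategy.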
The GTA will post instructions on the Web site for downloading and installing MongoDB on your computer.
We will have data files for the Players, Player_Cards, Player_Assists_Goals, Match_results, Country, and Worldcup_History tables posted on the Web site in relational (flat file) format.
Your tasks for the project are as follows:
1. Install MongoDB on your computer.
2. Write programs to extract the data needed for the two document types above (COUNTRY and STADIUM) from the relational data files, and load these documents into the MongoDB system.
3. Write some MongoDB queries to retrieve some of the stored documents.
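As a rough illustration of tasks 2 and 3, the sketch below loads a list of documents (Python dicts) into a collection and runs one query against the COUNTRY documents. It assumes the PyMongo driver is used, which the assignment does not require. The connection URI, the database name "worldcup2014", the collection names, and the helper function names are all hypothetical placeholders. The query is only an example and is not one of the queries that will be provided; it also assumes no_Goals was stored as a number, since $gte on strings compares them as text.

```python
def get_collections(uri="mongodb://localhost:27017", db_name="worldcup2014"):
    """Connect to a MongoDB server; the URI and all names are placeholders."""
    # Imported here so the helpers below can be exercised without the driver
    # or a running server. Requires: pip install pymongo
    from pymongo import MongoClient

    db = MongoClient(uri)[db_name]
    return db["country"], db["stadium"]


def load_documents(collection, docs):
    """Replace the collection's contents with docs; return the number inserted."""
    # Clearing first means re-running the loader does not duplicate documents.
    collection.delete_many({})
    if not docs:
        # insert_many rejects an empty list, so handle that case explicitly.
        return 0
    return len(collection.insert_many(docs).inserted_ids)


def countries_with_goal_scorers(collection, min_goals):
    """Names of countries with at least one player who scored >= min_goals."""
    # $elemMatch matches documents whose players array contains an element
    # satisfying the nested condition.
    query = {"players": {"$elemMatch": {"no_Goals": {"$gte": min_goals}}}}
    projection = {"_id": 0, "Cname": 1}
    return [doc["Cname"] for doc in collection.find(query, projection)]
```

With a server running, a call such as load_documents(country_coll, country_docs) followed by countries_with_goal_scorers(country_coll, 3) would exercise both steps, where country_coll comes from get_collections() and country_docs is whatever list of COUNTRY documents your program built.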
Due Dates: You should turn in an intermediate report by Tuesday, July 28 (11.59pm) that includes your preliminary program design in pseudo-code (a high-level code description) and the data structures that will be used. You must also demonstrate that you have installed MongoDB by creating a simple document (object) and turning in documentation to show this.
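One possible way to produce the installation evidence is a short script that inserts a single document and reads it back. The sketch below assumes the PyMongo driver and a server on the default local port; the database name "install_check", the collection name "demo", the sample document, and the function names are hypothetical. The mongo shell would serve equally well for this step.

```python
def roundtrip(collection, document):
    """Insert one document and read it back by its generated _id."""
    inserted_id = collection.insert_one(document).inserted_id
    return collection.find_one({"_id": inserted_id})


def main(uri="mongodb://localhost:27017"):
    # Imported here so roundtrip() can be tried without the driver installed.
    # Requires: pip install pymongo
    from pymongo import MongoClient

    # A short timeout makes a missing server fail quickly instead of hanging.
    client = MongoClient(uri, serverSelectionTimeoutMS=2000)
    client.admin.command("ping")  # raises if no server is reachable

    stored = roundtrip(
        client["install_check"]["demo"],
        {"course": "CSE 5331/4331", "project": 2},
    )
    print(stored)
```

Calling main() once the server is running prints the stored document, including the _id that MongoDB assigned; that output could be included in the intermediate report.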
The complete project is due 11.59pm, Thursday, August 6. A late penalty of –5% per day late will be assessed. You should turn in: (i) Sufficient documentation on the design and implementation of the program, including the updated pseudo-code and a detailed description of each data structure used. (ii) An output file that shows your queries and the query results. (iii) The source code for your program, with sufficient internal documentation (comments) for the GTA to understand and execute your program. A demo will be required.
Note 1: This project can be done in groups of up to 2 students. All students in a group will receive the same grade.
Note 2: A number of queries will be provided one week before the project is due. You should run all the provided queries and turn in the result for each query.