Chapter VIII

Data and file management fundamentals

Database fundamentals

Information security fundamentals

File fundamentals

1-What is a file?

  1. A file is the smallest named collection of data/information or instructions stored on a storage medium.
  2. A file must have a name before being saved.
  3. The file name must be descriptive enough to indicate the content of the file and must abide by the following criteria:
  4. Maximum length shouldn’t exceed 255 characters
  5. Avoid prohibited characters such as * ? " / | < > : \
  6. Case sensitivity depends on the operating system: “A” and “a” are different characters in a file name on Linux, while Windows treats them as the same.
  7. The file name must end with an extension that is related to the format used when the file was saved (.txt, .doc, .gif, .xls, etc…)
  8. The native file format is the format used when we create the file. Example: if we use MS Word to create a document and then save the document as a PDF file, its native file format is .doc and its file name extension is .pdf.
  9. File name extensions are traditionally 3 characters long, although longer ones such as .docx and .html are also common.
  10. The operating system can be set to hide the file name extension
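
The naming criteria above can be checked mechanically. The following is a minimal sketch in Python, not a definitive validator: the function name check_file_name and the exact set of prohibited characters are assumptions made for illustration, based on the characters Windows rejects.

```python
import os

# Characters commonly rejected in Windows file names (assumed set for this sketch).
PROHIBITED = set('*?"/|<>:\\')
MAX_LENGTH = 255


def check_file_name(name):
    """Return a list of rule violations; an empty list means the name passes."""
    problems = []
    if not name:
        problems.append("name is empty")
    if len(name) > MAX_LENGTH:
        problems.append("longer than 255 characters")
    bad = sorted(set(name) & PROHIBITED)
    if bad:
        problems.append("prohibited characters: " + " ".join(bad))
    stem, extension = os.path.splitext(name)
    if not extension:
        problems.append("missing extension")
    return problems


print(check_file_name("budget_2024.xls"))   # passes every check
print(check_file_name("what?.txt"))         # flags the question mark
print(check_file_name("notes"))             # flags the missing extension
```

An empty list means the name satisfies every rule that the sketch checks; it does not test whether the name is descriptive, which remains a human judgment.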

2-Types of files:

  1. Program file or software file:
  2. consists of software instructions designed to instruct the computer how to perform specific applications
  3. an application may consist of one or several programs
  4. all program files are saved on the hard disk in a folder called the “Program Files” folder, which is the default folder used by the operating system to store software programs, including the ones that you download from the Web.
  5. A program file can be an ASCII text file that needs conversion to binary, or a binary executable file that doesn’t need any conversion.
  6. Data files:
  7. Files that hold data/information of all types
  8. No instructions included in data files.
  9. Content varies from text to documents to multimedia (pictures, audio, video, etc…)

3-Physical storage model:

  1. Each storage medium is formatted and divided into sectors
  2. A CD has one single spiral track about 3 miles long, holding 336,000 sectors of 2,048 bytes each: 336,000 x 2,048 = 688,128,000 bytes, or about 700 megabytes
  3. The smallest storage location that the operating system allocates to a file is not the sector but the cluster
  4. A cluster is formed of several contiguous sectors.
  5. The operating system stores each record of a saved file in an available cluster and gives it a flag and an order number in the corresponding file.
  6. Consequently, a given file may have its records dispersed all over the storage medium
  7. The operating system maintains a list of the addresses of files in a FAT (File Allocation Table), which serves like the occupancy board at the entrance of office buildings.
  8. A defragmentation utility brings the records of the same file as close to each other as possible
  9. The FAT addressing technology helps the operating system retrieve files when we want them to be displayed.
  10. NTFS (New Technology File System), HPFS (High Performance File System) and IFS (Installable File System) are newer, more advanced technologies than FAT, which is still used because of its effectiveness.
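
The way a table of cluster addresses lets the operating system scatter and later reassemble a file can be illustrated with a toy model. This is only a sketch under simplifying assumptions: the 12-cluster "disk", the directory dictionary and the function names save, delete and clusters_of are all hypothetical, and a real FAT stores much more information.

```python
# Toy model of a File Allocation Table: one entry per cluster.
FREE = None      # cluster is vacant
END = -1         # marks the last cluster of a file

fat = [FREE] * 12        # a tiny "disk" with 12 clusters
directory = {}           # file name -> number of its first cluster


def save(name, clusters_needed):
    """Store a file in the first vacant clusters and chain them together."""
    vacant = [i for i, entry in enumerate(fat) if entry is FREE][:clusters_needed]
    if len(vacant) < clusters_needed:
        raise OSError("disk full")
    for current, following in zip(vacant, vacant[1:]):
        fat[current] = following
    fat[vacant[-1]] = END
    directory[name] = vacant[0]


def delete(name):
    """Mark the file's clusters vacant; the data itself is not erased."""
    cluster = directory.pop(name)
    while cluster != END:
        following = fat[cluster]
        fat[cluster] = FREE
        cluster = following


def clusters_of(name):
    """Follow the chain in the table to list a file's clusters in order."""
    chain = []
    cluster = directory[name]
    while cluster != END:
        chain.append(cluster)
        cluster = fat[cluster]
    return chain


save("a.txt", 3)      # occupies clusters 0, 1, 2
save("b.txt", 2)      # occupies clusters 3, 4
delete("a.txt")       # clusters 0, 1, 2 become vacant again
save("c.txt", 5)      # fills 0, 1, 2 and then jumps over b.txt to 5, 6
print(clusters_of("c.txt"))
```

The last file ends up in two separate runs of clusters, which is the kind of dispersion a defragmentation utility is meant to undo.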

4-File storage and retrieval (Logical storage model)

  1. The operating system creates a directory for each storage medium and maintains a table of addresses for all files stored in that directory
  2. The directory is considered to be the logical storage model that identifies the path to each file through directories and subdirectories.
  3. This logical storage model is like a tree metaphor that consists of:
  4. Root directory: The storage medium depicted as the trunk of the tree. Example for the hard disk (C:\)
  5. Subdirectories: considered as the folders and subfolders depicted as the branches.
  6. Files depicted as the leaves
  7. Example: C:\electronics\computers\notebook\apple.xls
  8. The above example, from a computer store, represents the path of the Excel file of available Apple notebooks: the file is stored in the notebook subfolder, which is stored in the computers subfolder, which is in turn stored in the electronics folder on the hard disk.
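
The tree structure of a path can be taken apart with Python's standard pathlib module. The sketch below uses the example path from this section, with the Excel extension written as .xls; PureWindowsPath only parses the text of the path and does not need the file to exist.

```python
from pathlib import PureWindowsPath

# PureWindowsPath parses Windows-style paths on any operating system.
path = PureWindowsPath(r"C:\electronics\computers\notebook\apple.xls")

print(path.anchor)   # root directory (the trunk): C:\
print(path.parts)    # every level of the tree, from root to file
print(path.parent)   # the folder that holds the file
print(path.name)     # the file itself (the leaf)
print(path.suffix)   # the file name extension
```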

5-In the screenshot above of Windows Explorer:

  1. The folders are represented by a folder shape followed by the folder name
  2. The files that are not saved in folders are listed below the folders with their file name extension
  3. The date is the date when the file was created or last updated
  4. The volume represents the actual byte count of each file that corresponds to the listed date.

6-File management software (Windows Explorer or Mac Finder) allows you to manipulate files and folders in the following ways:

  1. Rename: change the name of the file (Save As, by contrast, creates a second copy under the new name)
  2. Copy: make a copy of the file so you can paste it in another location
  3. Move: you can move the file from its current location to another one, which changes its path in the logical storage model accordingly.
  4. Delete: move the file to the Recycle Bin folder
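
The same four operations can also be carried out by a program. The following is a minimal sketch using Python's standard library inside a temporary folder; the file and folder names are made up for the demonstration, and deletion here removes the file directly instead of sending it to a recycle bin.

```python
import shutil
import tempfile
from pathlib import Path

# Work inside a throwaway folder so no real files are touched.
workspace = Path(tempfile.mkdtemp())
reports = workspace / "reports"
reports.mkdir()

original = workspace / "draft.txt"
original.write_text("quarterly figures")

# Rename: same folder, new name.
renamed = workspace / "final.txt"
original.rename(renamed)

# Copy: the original stays in place and a duplicate appears elsewhere.
copied = reports / "final_copy.txt"
shutil.copy(renamed, copied)

# Move: the file leaves its old folder and shows up in the new one.
shutil.move(str(renamed), str(reports / "final.txt"))

# Delete: Path.unlink removes the file outright; it does not use a recycle bin.
copied.unlink()

remaining = sorted(p.name for p in reports.iterdir())
print(remaining)

shutil.rmtree(workspace)  # clean up the throwaway folder
```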

7-Move/delete files:

  1. When you move a file to a different storage medium, all the bits at its old location stay where they were until they are overwritten by other files.
  2. Only the status of the corresponding clusters is changed from occupied to vacant, so the operating system can use them to store other files.
  3. When you delete a file you are, in fact, moving it to the Recycle Bin folder, where it stays and can be retrieved as long as that folder is not emptied or full.
  4. A file shredder is software that overwrites the data of the old file with random zeroes and ones.
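
The idea behind a file shredder can be sketched in a few lines. This is an illustration only, assuming a single overwrite pass and a hypothetical function name shred; on solid-state drives and journaling file systems an in-place overwrite is not guaranteed to reach the original storage cells, so real shredding tools do more than this.

```python
import os
import tempfile


def shred(path):
    """Overwrite a file's bytes with random data, then remove it."""
    size = os.path.getsize(path)
    with open(path, "r+b") as handle:
        handle.write(os.urandom(size))   # random zeroes and ones replace the content
        handle.flush()
        os.fsync(handle.fileno())        # ask the OS to push the bytes to the medium
    os.remove(path)


# Demonstration on a temporary file.
descriptor, demo_path = tempfile.mkstemp()
os.write(descriptor, b"secret payroll data")
os.close(descriptor)

shred(demo_path)
print(os.path.exists(demo_path))
```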

Database fundamentals

1-What is a database?

  1. A database is a collection of related data files that consists of all data/information of a business or organization.
  2. Database files are also called tables or relations because they are tables similar to spreadsheet files.
  3. In other words, a database may be defined as a collection of related files used as a centralized, homogeneous source of data/information by many users in a business, and flexible enough to allow the following procedures:
  4. Collect and store data
  5. Update data
  6. Organize and output data
  7. Find and analyze data

2-Data warehouse and data-mart

  1. If the number of files in a business or organization is too big, which is usually the case in big businesses, then several databases will be necessary and a data warehouse is created which consists of a collection of several related databases.
  2. A small data warehouse is known as a data-mart.
  3. Creating and manipulating a database is not possible without powerful software known as a DBMS (Database Management System). MS Access is a database management system.

3-Functions of DBMS

  1. Helps create the database files or projects
  2. Helps manipulate the database: update information, add new information, delete information and perform all other necessary operations.
  3. Sorts data based on given criteria.
  4. Provides an interface between the user and the database through 2 applications:
  5. Front end application: the interface with the user, such as the forms and direct links provided by the application
  6. Back end application: interacts with the programs and applications of the database that are used by the users.
  7. Prepares routine tasks using available data:
  8. Paychecks for employees
  9. Issue letters and labels and other promotional material.
  10. Tax forms
  11. Client and supplier accounts
  12. Etc…
  13. Helps decision makers with the decision process by providing the reports and statistics needed for that purpose.
  14. Allows authorized people to query the database in order to get the information and data they need.
  15. Helps enhance data security by protecting data from intruders, attacks and all unauthorized queries.

4-Database structure:

  1. Field: the building block of the database and its data. It is the smallest data element and must have well defined characteristics:
  2. Length: how many characters, and whether it is fixed or variable.
  3. Type: alphabetic, numeric, decimal, currency, date, etc…
  4. Never store last name and first name in the same field.
  5. Each part of the name must have its own field. Example: the name Dr. Maya N. Abdallah Jr. must be stored in 5 fields.
  6. Each data element that fits in a field is known as the field attribute.
  7. Record:
  8. The record is a collection of related fields.
  9. If the number of fields is too big the record will be very bulky and we may need to create another record and consequently another file.
  10. Table or relation
  11. The file in a database is known as a table or a relation.
  12. It consists of a collection of many related records
  13. Files in the database have relationships necessary to extract all information we need from different files.
  14. Each table, regardless of its size, is composed of:
  15. One record type which consists of the record template of all the labels of the record.
  16. A number of record occurrences equal to the number of the population of the file
  1. Database structure is depicted in the following sample database of a department store, where the database is composed of 5 tables as follows: Clients, Suppliers, Employees, Inventory, Sales
  2. A sample of the Employees file is depicted in the table below:

ID # / Last name / First name / Suffix / Department # / Date of hire
123456 / Smith / Ted / Sr. / 22 / 050508
456789 / Rogers / Bill / / 19 / 022599
157157 / Salam / Ziad / Jr. / 18 / 031608
989898 / Jolie / Raya / / 22 / 111210
  1. The record type is the first row of the table (the red labels row)
  2. Each entire record (row of the table) filled with information is an occurrence.
  3. Each cell of a record row is a field, and the data in each field of the record is the attribute of that field.
  4. The name Ted Smith Sr., as you can see, is stored in 3 fields (one for each part of the name). The file designer must take the longest names into consideration and add enough fields to accommodate the whole population of the file.
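
The sample Employees table can be built in an actual database to see the record type and the occurrences side by side. The sketch below uses Python's built-in sqlite3 module; the column names and types are assumptions chosen for illustration, and the dates are kept as text in what appears to be MMDDYY form.

```python
import sqlite3

connection = sqlite3.connect(":memory:")   # temporary database held in memory

# The record type: one column per field, each with a declared type.
connection.execute("""
    CREATE TABLE employees (
        id         INTEGER PRIMARY KEY,
        last_name  TEXT NOT NULL,
        first_name TEXT NOT NULL,
        suffix     TEXT,
        department INTEGER,
        hire_date  TEXT          -- kept as written in the sample table
    )
""")

# The record occurrences: one row per employee.
rows = [
    (123456, "Smith",  "Ted",  "Sr.", 22, "050508"),
    (456789, "Rogers", "Bill", None,  19, "022599"),
    (157157, "Salam",  "Ziad", "Jr.", 18, "031608"),
    (989898, "Jolie",  "Raya", None,  22, "111210"),
]
connection.executemany("INSERT INTO employees VALUES (?, ?, ?, ?, ?, ?)", rows)

occurrences = connection.execute("SELECT COUNT(*) FROM employees").fetchone()[0]
fields = [column[1] for column in connection.execute("PRAGMA table_info(employees)")]
print(occurrences, fields)
```

Employees without a suffix simply leave that field empty (None), which is why the name is split into separate fields in the first place.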

5-Keys:

  1. Database manipulation and data retrieval from a database are not possible without a key that identifies the data and information stored in the database.
  2. There are two types of keys:
  3. Primary key: a field whose value is unique to each record (PID, SSN, ID#, etc…). The database designer must be very careful when defining primary keys, especially for inventory items.
  4. Secondary key: May be any other field of the file.
  5. In the sample file above:
  6. ID# is the primary key that is unique to each employee (it is not possible that 2 employees get the same ID# in the same business)
  7. All other fields of the record type may be used as secondary keys when needed to extract information. Example: we may use the date of hire as a key to get a list of all employees who were hired in 2008.
  8. Secondary keys are mostly used to get lists and reports that follow specific criteria, presented or previously scheduled by the authorities that requested the reports, and their values are subject to frequent updates.
  9. The primary key, on the other hand, is normally permanent information that is never or rarely updated.
  10. As we may notice, secondary keys are sometimes used much more than primary keys, which may end up being used only in operations that follow the application of secondary keys.
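
The difference between the two kinds of keys shows up clearly in a small experiment. This is a sketch using Python's sqlite3 module with a reduced version of the sample Employees file; the table layout is an assumption, and the 2008 filter relies on the dates ending in the two-digit year.

```python
import sqlite3

connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE employees (id INTEGER PRIMARY KEY, last_name TEXT, hire_date TEXT)"
)
connection.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [(123456, "Smith", "050508"), (456789, "Rogers", "022599"),
     (157157, "Salam", "031608"), (989898, "Jolie", "111210")],
)

# Primary key: a second employee with an existing ID# is rejected.
try:
    connection.execute("INSERT INTO employees VALUES (123456, 'Doe', '010120')")
    duplicate_accepted = True
except sqlite3.IntegrityError:
    duplicate_accepted = False

# Secondary key: date of hire selects everyone hired in 2008 (dates end in "08").
hired_2008 = [
    row[0]
    for row in connection.execute(
        "SELECT last_name FROM employees WHERE hire_date LIKE '%08' ORDER BY last_name"
    )
]
print(duplicate_accepted, hired_2008)
```

The database refuses the duplicate ID#, while the date of hire is free to repeat and therefore returns a list of records.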

6-Database relations

  1. Most modern business databases are relational databases, which means that each table of the database has one or more fields in common with another table or tables. The relationship between two tables may be classified under 3 types:
  2. One to one: each record in one table corresponds to exactly one record in the other table.
  3. One to many: one supplier supplies many products (the business uses only one supplier for a part of its products).
  4. Many to many: one supplier supplies many products and a product is supplied by many suppliers (many suppliers supply the same products to the same company).
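
A many to many relationship between suppliers and products is usually stored with a third, linking table. The following sketch uses Python's sqlite3 module; the table names, the supplier names Acme and Globex and the helper function names are all hypothetical. A one to many relationship would need only a supplier column in the products table.

```python
import sqlite3

connection = sqlite3.connect(":memory:")
connection.executescript("""
    CREATE TABLE suppliers (supplier_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE products  (product_id  INTEGER PRIMARY KEY, name TEXT);

    -- Many to many: a linking table pairs suppliers with products.
    CREATE TABLE supplies (
        supplier_id INTEGER REFERENCES suppliers(supplier_id),
        product_id  INTEGER REFERENCES products(product_id),
        PRIMARY KEY (supplier_id, product_id)
    );

    INSERT INTO suppliers VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO products  VALUES (10, 'Notebook'), (11, 'Monitor');
    INSERT INTO supplies  VALUES (1, 10), (1, 11), (2, 10);
""")


def names(query, value):
    """Run a one-parameter query and return the first column as a list."""
    return [row[0] for row in connection.execute(query, (value,))]


# One supplier supplies many products.
products_from_acme = names("""
    SELECT products.name FROM products
    JOIN supplies  ON supplies.product_id = products.product_id
    JOIN suppliers ON suppliers.supplier_id = supplies.supplier_id
    WHERE suppliers.name = ? ORDER BY products.name""", "Acme")

# One product is supplied by many suppliers.
suppliers_of_notebook = names("""
    SELECT suppliers.name FROM suppliers
    JOIN supplies ON supplies.supplier_id = suppliers.supplier_id
    JOIN products ON products.product_id = supplies.product_id
    WHERE products.name = ? ORDER BY suppliers.name""", "Notebook")

print(products_from_acme, suppliers_of_notebook)
```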

7-Data warehouse and data-mart:

  1. If the number of files in a business or organization is too big, which is usually the case in big businesses, then several databases will be necessary and a data warehouse is created
  2. A data warehouse consists of a collection of several related databases that form one big multidimensional database and cover all the files of a multidimensional establishment.
  3. A small data warehouse is known as a data-mart.
  4. They can be depicted as a cube where each separate slice represents a database.

8-Database operations:

  1. Normalization: process of eliminating redundant data which results in reducing the size of the database.
  2. Selection: process of selecting from the database the records that meet given criteria. Example: clients whose accounts exceed $5,000.
  3. Joining tables: to reduce the number of tables or files in a database, we may join 2 or more tables together to form one table.
  4. Data-mining: finding hidden relationships between data in several databases of the data warehouse
  5. Data dictionary: a complete description of the database design and of all data fields and their characteristics.
  6. Database schema: depicts the database structure, written in plain English or any other language. It complements the data dictionary in explaining the content of the database.
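
Selection and joining can both be tried on a small sample. This is a sketch with Python's sqlite3 module; the clients and sales tables, their columns and the sample values are invented for illustration.

```python
import sqlite3

connection = sqlite3.connect(":memory:")
connection.executescript("""
    CREATE TABLE clients (client_id INTEGER PRIMARY KEY, name TEXT, balance REAL);
    CREATE TABLE sales   (sale_id INTEGER PRIMARY KEY, client_id INTEGER, amount REAL);

    INSERT INTO clients VALUES (1, 'Haddad', 7200.0), (2, 'Lopez', 1500.0), (3, 'Chen', 5000.0);
    INSERT INTO sales   VALUES (100, 1, 250.0), (101, 1, 90.0), (102, 2, 40.0);
""")

# Selection: clients whose accounts exceed $5000.
selected = [row[0] for row in connection.execute(
    "SELECT name FROM clients WHERE balance > 5000 ORDER BY name")]

# Join: combine matching rows of two tables into one result table.
joined = connection.execute("""
    SELECT clients.name, sales.amount
    FROM clients JOIN sales ON sales.client_id = clients.client_id
    ORDER BY sales.sale_id
""").fetchall()

print(selected)
print(joined)
```

A client with a balance of exactly 5000 is not selected, because the criterion is "exceeds".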

9-Object oriented database (OODB)

  1. An OODB consists of classes of objects and subclasses or sub-objects. Example:
  2. Transportation: air, sea, ground; ground: railroad, automobiles; automobiles: buses, trucks, etc…
  3. The most important advantage of OODB is its reusability (it can be treated like a template)

10-Querying database:

  1. SQL (Structured Query Language) is the language used to query the database; forms filled in by the user are converted into SQL queries by the DBMS. Example: grade form on CGS2100 website.
  2. Commonly used SQL keywords are:
  3. CREATE
  4. DELETE
  5. INSERT
  6. JOIN
  7. SET
  8. SELECT
  9. UPDATE
  10. Example (with illustrative column names): SELECT * FROM electronics WHERE Type = 'TV' AND Size = 32 AND Brand = 'Sony'
  11. SQL also allows the use of logical operators: AND, OR, NOT
  12. The grade form on the course website: when you fill it in and submit it, it is converted into SQL to select your grade page from the database of grades.
  13. Query by objective:
  14. Used to get lists by categories of data: Example (seafood, meat, poultry, fruit, vegetables, etc…)
  15. Used in Object Oriented Data Base (OODB) where data consists of classes of objects and subclasses or sub-objects and where the query will result in lists of data objects and sub-objects meeting selected data criteria.
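
Several of the SQL keywords and logical operators listed above can be seen working together in a short session. The sketch below runs real SQL through Python's sqlite3 module; the electronics table, its columns and its sample rows are assumptions made for the example.

```python
import sqlite3

connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE electronics (item TEXT, brand TEXT, screen_inches INTEGER, price REAL)"
)
connection.executemany(
    "INSERT INTO electronics VALUES (?, ?, ?, ?)",
    [("TV", "Sony", 32, 299.0), ("TV", "Sony", 55, 799.0),
     ("TV", "Acme", 32, 199.0), ("Radio", "Sony", None, 49.0)],
)

# SELECT with AND: 32-inch TVs made by Sony.
sony_32 = connection.execute(
    "SELECT item, brand, screen_inches FROM electronics "
    "WHERE item = 'TV' AND brand = 'Sony' AND screen_inches = 32"
).fetchall()

# NOT and OR: everything that is not a TV, or that costs less than 200.
other = connection.execute(
    "SELECT item, brand FROM electronics "
    "WHERE (NOT item = 'TV') OR price < 200 ORDER BY price"
).fetchall()

# UPDATE ... SET changes stored values; DELETE removes records.
connection.execute(
    "UPDATE electronics SET price = 279.0 WHERE brand = 'Sony' AND screen_inches = 32"
)
connection.execute("DELETE FROM electronics WHERE item = 'Radio'")
remaining = connection.execute("SELECT COUNT(*) FROM electronics").fetchone()[0]

print(sony_32, other, remaining)
```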

11-Designing user interface

  1. Forms are the most used user interface; they should be designed carefully to be as user friendly as possible. In a well-designed form:
  2. Fields are arranged in a logical order
  3. Boxes and fill-in areas are clear, visible and consistent with the data to fill in
  4. Easy samples are provided with instructions about filling in the data.

12-OODB (Object Oriented Database)

  1. Data is stored as objects that are grouped into classes and subclasses
  2. OODB is reusable and portable because:
  3. Functions and application methods are defined with each object and can be reused by all subclasses.
  4. You only need to add your special parameters and the model will work for you if your activity belongs to the same class.
  5. Example: if a transportation network is an object, there are general functions and attributes that are valid for all kinds of transportation classes:
  6. Air transportation
  7. Water transportation
  8. Ground transportation
  9. In ground transportation there are general functions and attributes that are valid for all forms of ground transportation
  10. We don’t always have to start from scratch when building a database
  11. Another example is Grocery: meat, seafood, poultry, fruit, etc…
  12. Classes include data and functions that manipulate data, and lower subclasses inherit these functions from higher classes
  13. OODBMS (Object Oriented Database Management System) is needed to manipulate the OODB
  14. Access is a relational DBMS and is not designed to manipulate an OODB.
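
The way subclasses inherit functions from higher classes can be sketched with ordinary Python classes. This is not an OODBMS, only an illustration of the class and subclass idea; the attributes capacity and wheels and the method describe are hypothetical.

```python
class Transportation:
    """General class: attributes and functions shared by every kind of transport."""

    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity

    def describe(self):
        return f"{self.name} carries up to {self.capacity} passengers"


class GroundTransportation(Transportation):
    """Subclass: inherits describe() and adds what is specific to ground travel."""

    def __init__(self, name, capacity, wheels):
        super().__init__(name, capacity)
        self.wheels = wheels


class Bus(GroundTransportation):
    """Lower subclass: only supplies its own parameters."""

    def __init__(self, capacity):
        super().__init__("Bus", capacity, wheels=6)


city_bus = Bus(capacity=40)
print(city_bus.describe())   # describe() is reused from the top-level class
```

The Bus class defines no describe() of its own; it only adds its special parameters and reuses everything else, which is the reusability the list above refers to.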

13-Management approach using files or database pros and cons:

  1. File approach: each department will create and maintain its own files
  2. Strengths:
  3. More security
  4. Limited and clear responsibility
  5. Close knowledge of the subject matter of data/information
  6. Weaknesses:
  7. Redundancy: same record is repeated in many departments.
  8. Limited involvement in the general business picture
  9. No or limited networking possibility
  10. No cooperation between business departments
  11. Poor quality reports and poor decision-making information tools.
  1. Database approach:
  2. Strengths:
  3. No redundancy
  4. Centralized source of same information among all departments
  5. Networking heaven, because a database server is able to provide information to all clients
  6. Security is enhanced through a strict central policy and because only a limited number of people manipulate the database.
  7. Much better quality reports and better customer and supplier service.

Malware and computer security

Threat to information

1-Why are information systems always under attack?

  1. As we have illustrated in previous chapters, business executives believe that the information they possess and use throughout their business information system is probably the most valuable asset they have.
  2. Since the Internet includes almost all the media used to exchange and manipulate information, corporate and government networks, being de facto parts of the Internet, find themselves under permanent attack, with the Internet as the battlefield
  3. Many types of attacks can be made on computer systems
  4. Malware: Viruses, worms and Trojan horses
  5. Identity theft
  6. Theft of personal information
  7. Unauthorized use of others’ computers

2-Origin of the threat may be attributed to the following:

  1. Business intelligence is the process of gathering and analyzing information about the market and business competitors. Counterintelligence measures may protect from this threat or at least minimize it.
  2. Hackers and intruders want to get possession of valuable information and sell it to competitors for large sums; they find their way in through software and network vulnerabilities resulting from bugs (software holes).
  3. Software bugs or security holes allow violations of information security.
  4. These bugs are usually dealt with by the OS provider, which removes them and uses frequent updates to fill the resulting holes.
  5. The update is achieved by using patches (healthy programs that replace the extracted bugs)
  6. The update process usually enhances security by installing well-functioning software patches that deny hackers access to many of their harmful tools

3-Piracy and Plagiarism

a. Piracy is taking illegal possession of intellectual property such as software and making counterfeit copies destined to be sold at a very low price.

b. Plagiarism involves taking credit for someone else’s intellectual property, typically a written idea, by claiming it as your own.

4-Authentication:

  1. Authentication is a widely used technique to protect information systems and computer networks against intruders and attacks.
  2. There are 3 authentication approaches:
  3. Something you know (User ID and password)
  4. A password is a combination of characters known only to the user and used for authentication.
  5. A good password must have most of the following requirements:
  6. Strong by including words that are unrelated to your interests, and include upper and lowercase letters, numbers, and symbols
  7. Unique (same password must not be used in many different accounts).
  8. Changed regularly (at least once every 3 months)
  9. Something you have (badge, tag, etc…)
  10. Badges or ID cards are something you may have to carry in order to be allowed access to many restricted places
  11. Important high security areas include: labs, computer systems, production plant premises and equipment
  12. Simulators and training equipment, etc...
  13. Biometrics: something exclusively specific to you and unique about you:
  14. Biometric authentication technology consists of scanning and measuring a person’s unique physical features such as:
  15. Fingerprints: The least expensive biometric authentication technology.
  16. Retinal patterns: maps and measures the blood vessels at the back of the eye (the retina).
  17. Facial characteristics
  18. Facial and retinal are relatively expensive technologies
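
The password requirements listed above lend themselves to an automatic check. The following is a minimal sketch; the function name password_weaknesses and the 12-character minimum length are assumptions, since the list does not state a length, and the check cannot tell whether the words relate to the user's interests or how recently the password was changed.

```python
import string


def password_weaknesses(password, passwords_used_elsewhere=()):
    """Return the listed requirements that a candidate password fails."""
    weaknesses = []
    if len(password) < 12:                      # minimum length is an assumption
        weaknesses.append("too short")
    if not any(c.islower() for c in password):
        weaknesses.append("no lowercase letter")
    if not any(c.isupper() for c in password):
        weaknesses.append("no uppercase letter")
    if not any(c.isdigit() for c in password):
        weaknesses.append("no number")
    if not any(c in string.punctuation for c in password):
        weaknesses.append("no symbol")
    if password in passwords_used_elsewhere:    # uniqueness across accounts
        weaknesses.append("already used for another account")
    return weaknesses


print(password_weaknesses("sunshine"))
print(password_weaknesses("Quartz!Lantern7Orbit"))
```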

5-Interior threats:

a. Interior threats are risks and dangers from legitimate users that may harm a business information system and its network resources; they include:

i. Threats to system health and stability

ii. Information theft

iii. Employees, who are a very serious threat through negligence or infidelity.

b. Safeguards include the use of security and usage policies, which define acceptable and unacceptable uses of computer and network resources by business employees

c. Employers are not legally responsible for errant employees’ behavior when using business IT resources

d. When using these resources, employees must abide by ethical rules that are widely known to everyone.

6-War driving:

a. Driving through neighborhoods with a wireless notebook or handheld computer and looking for unsecured (unprotected) Wi-Fi networks.

b. War drivers use the discovered access points for illegal Internet use and fraudulent personal and business Internet transactions.

7-Hacking and hackers:

a. A hacker is an extremely skilled programmer who may use his expertise to write software that overcomes a network’s security system and penetrates the network with or without the user’s knowledge.

b. There are three types of hackers:

  1. Black hat hacker (criminal category of hackers):
  2. Shrewd and very skilled programmer who designs, implements and executes hacking schemes.
  3. Hacking scheme is a plan based on written hacking software that enables the hacker to access and possibly use his victim’s IT resources.
  4. This act is considered a felony that results in variable levels of damage that are usually very difficult to quantify.
  5. White hat hacker:
  6. Shrewd and very skilled programmer who makes legal money when hired by establishments to test the security of their networks and information system.
  7. His clients may include, without being limited to:
  8. Corporate and financial establishments.
  9. Government agencies and bodies.
  10. Security agencies from all over the world.
  11. Grey hat hackers: Suspicious white hat hackers

8-Hacking convention and conferences