1. How many records does this file contain? How many fields are there per record?
It contains 7 records and there are 5 fields per record.
2. What problem would you encounter if you wanted to produce a listing by city? How might you alter the file structure in order to solve this problem?
The city is referenced within the complete address which makes it difficult to produce a listing when there are many records. In order to solve the problem one could split MANAGER_ADDRESS into 4 fields named MANAGER_ADDRESS, MANAGER_CITY, MANAGER_STATE and MANAGER_ZIP. Then there can be reports for each portion of the address.
3. How would you alter the file structure if you wanted to produce a listing of the file contents by last name, area code, city, state, or zip code?
Same as above. In order to produce a listing by first and last name separately one must break the field into 2 (PROJECT_MANAGER_FIRST AND PROJECT_MANAGER_LAST).
4. What data redundancies do you see in this file? How might these redundancies lead to anomalies?
The same name, phone and address are mentioned more than once creates a redundancy within the table and also when mentioned in more than one table. Redundancies can lead to anomalies which can cause an error message when trying to view the document. Assume one makes a change to the last name of Angela Dunstable because she got married but forgot to change it in all fields and tables; this could cause many problems within the company itself and the database.
5. Identify and discuss the data redundancy problems in the file structure shown above.
There are a lot of redundancies in this file such as the same PROJ_NUM's. There needs to be a 2nd table with a many-to-many relationship between PROJ_NUM and EMP_NAME to support the redundancies in this file.
6. What changes would you recommend to the contents of EMP_NAME and EMP_PHONE?
I would recommend changing EMP_NAME to EMP_NAME_F and EMP_NAME_L in order to have the first and last name in separate fields. That will produce a listing for either of the two names for each employee. I don't see a need to change the format of the EMP_PHONE contents unless there is a need to produce a listing for the area code specifically. In that case it could be split into two fields as EMP_AREA_CODE and EMP_PHONE.
7. Identify the various data sources in the file you examined in problem 5.
The data sources are the raw data or information within the file. In this case, I examined similarities in the PROJ_NUM, PROJ_NAME, EMP_NUM, EMP_NAME, EMP_CODE, and EMP_PHONE.
8. Given your answer to problem 7, what new files should you create to help eliminate the data redundancies found in this file shown above?
There should be a separate table for each of the 3 projects and a 1-to-many relationship from the PROJECT to the EMPLOYEE. The job codes also prove to be redundant and should have a separate table with a 1-to-many relationship between JOB_CODE and EMPLOYEE.