CSCI5408 -Data   Management,  Warehousing- Assignment  #3 Solved

35.00 $

Category:
Click Category Button to View Your Next Assignment | Homework

You'll get a download link with a: zip solution files instantly, after Payment

Securely Powered by: Secure Checkout

Description

5/5 - (1 vote)

Objective:

·         The objective of this assignment is to understand industry problems related to data capture,

and database design. Create entity relationship model and perform normalization of the

database.

Assignment Rubric

  Excellent (25%) Proficient (15%) Marginal (5%) Unacceptable (0%) Problem #

where applied

Completeness

including

Citation

All required tasks

are completed

Submission

highlights tasks

completion.

However, missed

some tasks in

between, which

created a

disconnection

Some tasks are

completed, which

are disjoint in

nature.

Incorrect and

irrelevant

Problem #1
Correctness All parts of the

given tasks are

correct

Most of the given

tasks are correct

However, some

portions need minor

modifications

Most of the given

tasks are incorrect.

The submission

requires major

modifications.

Incorrect and

unacceptable

Problem #2
Novelty The submission

contains novel

The submission lacks

novel contributions.

The submission

does not contain

There is no novelty Problem #3

 

  contribution in key

segments, which is

a clear indication of

application

knowledge

There are some

evidences of novelty,

however, it is not

significant

novel contributions.

However, there is an

evidence of some

effort

   
Clarity The written or The written or The written or Failed to prove the Problem #1
  graphical

materials, and

graphical materials

and developed

graphical materials,

and developed

clarity. Need proper

background

 
  developed applications do not applications fail to knowledge to perform  
  applications

provide a clear

picture of the

concept, and

highlights the

clarity

show clear picture of

the concept. There is

room for

improvement

prove the clarity.

Background

knowledge is

needed

the tasks  

 

Citation:

McKinney, B. (2018). The impact of program -wide discussion board grading rubrics on students’ and faculty

satisfaction. Online Learning, 22(2), 289 -299.

Hypothetical Scenario

·         An established organization “HalifaxData5408” operates in Canada, and they have few clients

overseas.

·         Recently, “HalifaxData5408” signed a contract with the province “Nova Scotia”, and other

organizations for processing, enhancing, and storing their data.

·         You have joined “HalifaxData5408” as Information Specialist, and you will be in-charge of this

entire operation, which includes three problems or tasks. Since you are reporting to the

Manager, you need to document the entire o peration and provide justification for the choices

you make or decision you take.

 

Problem #1: Building a Data Model for Nova Scotia on its Provincial Parks

Visit the website https://parks.novascotia.ca/ and any other related websites that you find appropriate to gather information on Nova Scotia Parks. The province is trying to build an information system to capture all the key information related to the parks that are operating in the province. Your initial task is to identify the key entities and the relationships, so that at next phase of the project Nova Sc otia can decide on how to create the database.

Therefore, at this stage of the project, the province is expecting you to provide a correct and flexible data modelling, which is free from any of the design flaws (e.g., absence of capturing historical data, chasm trap, and fan-trap etc.)

Conditions/Steps You must Follow (Do not skip any point):

  1. This process does not require any web scrapping, therefore, do not perform such operations.
  2. You need to visit the website(s) and document your findings in a systemic manner.

o E.g., after visiting the website you find “Parks” an entity, then in a single sentence in the PDF file mention, why did you consider “Parks” as a valid entity. You should provide a tabular structure as mentioned in the 5th point.

 

  1. Identify at least 12 valid entities, and that does not include sub-types if you are considering an EER D.
  1. A valid entity means a proper strong or weak entity, which may have one or more attributes.

E.g., “ParkName ” is not a valid entity, it can be an attribute of entity “Parks”.

  1. Create a table of entities and provide the reason of your selection.
  2. Create an initial data modelling (Chen model) with entities you identified with the possible attributes and try to establish the relationships between the entities. You should also add cardinality at this stage. Perform this operation on a paper/ powerpoint/ word/paint etc. At

this stage you may get plenty of errors, design issues, and absence of attributes, or incorrect cardinalities, which are acceptable. This step will highlight your understanding of the problem, and the domain.

  1. In the next step, you need to perform a systematic approach to find solution for the design issues, or attributes that were not considered, or entities that you discovered new, and document it with possible solution. You need to write (within 1/2 page) the problems that you found in your paper (6th point) design and write your planning on how you are going to solve it.
  1. Once you find the solution, it is the time to build the final correct data modelling (ERD or EERD) using a tool like ErWin/ Visio / draw.io etc.
  • If you include EERD, then highlight the part in your ERD e.g., drawing acircle around the entity sets) that you want to extend.

Submission Expectations

  • Report in PDF ,
  • image of Initial ERD/EERD , and
  • image of final ERD/EERD .

Problem #2: Format Ocean Tracking Data and Report

Dalhousie Ocean Research wants you o xplore he dataset hey provided, n d perform the following:

  1. Read the document available at

http://oceantrackingnetwork.org/about/#oceanmonitoring

  1. Write a 1/2 page report (in your own words) on the different datasets, and attributes you discovered.
  1. Clean and transform the dataset using spreadsheet formula/filtering. You do not need to write any code or use any other tools.
  1. remove NULL values .
  2. rearrange the columns if needed.
  3. transform the data in a column or attribute if required to fit a common format.

 

  1. Is there a possibility of combining some of the tables or attributes without losing information (de-normalization)? If yes please perform the task and report your findings.
  1. Is there a possibility of decomposing some of the tables without losing information (normalization)? If yes please perform the task and report your findings.
  1. Based on the given dataset, create relational schema using MySQL DBMS
  2. Using MySQL Workbench and reverse engineering create the possible ERD. Your report must contain the ERD produced by he reverse engineering. In addition you need to add the cardinality.
  1. Populate the database with clean and transformed dataset (if dataset is huge, then import at least 1000 random data points or rows).

Submission Expectations

  • Report in PDF file,
  • ERD generated using MySQL Workbench,
  • Normalization/Denormalization (Logic and reason in the PDF file ),
  • SQLDump of Table structure and values (Before normalization or de-normalization)
  • SQLDump of Table structure and values (after normalization or de-normalization).

Problem #3: Opportunities in Halifax (Flexible project with abstract problem, and dataset)

A real-estate client wants to explore various factors that can create opportunities for future business investments in Halifax. The client wants to know f Halifax is good for Education, and if it has good shopping areas. They asked you to capture data from various public domains and create an enhanced

model. In addition they want a database populated with few hundred records just to analyze the data pattern. They are expecting an EERD that shows the possible relationships between various factors such as Education, and Shopping/Lifestyle.

To obtain information on various collected data, you can visit: https://data.novascotia.ca

Conditions/Steps You must Follow (Do not skip any point):

  1. This process does not require any web scrapping, therefore, do not perform such operations.
  2. You need to visit the website given and document your findings in a systemic manner.
  3. Identify at least 5 valid entities, and that does not include sub-types if you are considering an EERD.
  1. A valid entity means a proper strong or weak entity, which may have one or more attributes.

E.g., “SchoolNa me” is not a valid entity, it can be an attribute of entity “School”.

  1. Create an initial data modelling (Chen model) with entities you identified with the possible attributes and try to establish the relationships between the entities. You should also add cardinality at this stage. Perform this operation on a paper. At this stage you may get plenty of errors, design issues, and absence of attributes, or incorrect cardinalities, which are acceptable. This step will highlight your understanding of the problem, and the domain. (You do not need to solve design issues for this problem. However, it will be appreciated if you can solve the design issues)
  1. After downloading the required dataset, perform possible normalization/cleaning/de-normalization using spreadsheet formula/filtering and include the process in the PDF file with images.
  1. Import the clean dataset on MySQL Workbench and c eate the ERD using ever se engineering.

Submission Expectations

  • Report in PDF,
  • Image of ERD/EERD you created on paper.
  • Image of ERD obtained from MySQL Workbench.
  • SQLDump of Table structure and values.
  • ** ERD if you solve the design issues.

** Not a mandatory requirement.

  • Assignment3-wocdsj.zip