Task 1 – Applied Theory
You are required to read the case study referenced below:
Police Department using Predictive Analytics to foresee and Fight Crime
Refer to your textbook: Sharda, R. et al. (2018) Business Intelligence, Analytics, and Data Science: A Managerial Perspective. 4th edn. Pearson Education. Chapter 4, ‘Data Mining Process, Methods, and Algorithms’, pp. 190-193 (Opening Vignette 4.1).
In 1,500 words (±50), excluding the list of references, write a report that answers the questions below, using the case study as a guide:
- Why do law enforcement agencies and departments embrace advanced analytics and data mining? [4]
- Explain the main challenges facing the police department. Identify other challenges, not mentioned in the case study, that could benefit from data mining. [6]
- What results were obtained? [4]
- What sources of data do law enforcement agencies and departments use for their predictive modelling and data mining projects? [4]
- What types of analytics do law enforcement agencies and departments use to fight crime? [2]
- What does ‘the big picture starts small’ mean in this case study? [2]
- Describe how the following perspectives of the balanced scorecard (BSC) can help law enforcement agencies and police departments to achieve successful improvements:
  - Business Process [3]
  - Learning and Growth [3]
- Recommend appropriate solutions for the police department based on the following:
  - Data warehousing and architecture [3]
  - Data mining [3]
  - Text & web analytics [3]
  - Big data analytics [3]
Total Marks [40]
Task 2 – Using the dataset and description below, answer the questions that follow:
You will upload to Power BI (Desktop and/or Service) a dataset named ‘Census-Income (KDD) Data Set’, created by Terran Lane and Ronny Kohavi (Data Mining and Visualization, Silicon Graphics; terran ‘@’ ecn.purdue.edu, ronnyk ‘@’ sgi.com), available at https://archive.ics.uci.edu/ml/datasets/Census-Income+%28KDD%29.
IMPORTANT NOTE: The UCI website has useful information about the ‘Census-Income (KDD) Data Set’; it is therefore very important to study the metadata before uploading the dataset into Microsoft Power BI.
Data Set Characteristics: | Multivariate | Number of Instances: | 299285 | Area: | Social |
Attribute Characteristics: | Categorical, Integer | Number of Attributes: | 40 | Date Donated: | 2000-03-07 |
Associated Tasks: | Classification | Missing Values? | Yes | Number of Web Hits: | 209458 |
Table 1: Census Income Data Set [Source: ISCTE-IUL, 2016]
This data set contains weighted census data extracted from the 1994 and 1995 Current Population Surveys conducted by the U.S. Census Bureau. The data contains 41 demographic and employment-related variables.
The instance weight indicates the number of people in the population that each record represents due to stratified sampling. To do real analysis and derive conclusions, this field must be used. This attribute should *not* be used in the classifiers. One instance per line with comma delimited fields. There are 199523 instances in the data file and 99762 in the test file.
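The role of the instance weight can be illustrated with a minimal Python sketch. The records, the field names (`education`, `instance_weight`), and the weights below are made up for illustration and are not taken from the data file; the point is only that a weighted share can differ noticeably from a raw row count.

```python
# Illustrative sketch only: the records and field names below are hypothetical
# and do not come from the Census-Income (KDD) files.

def weighted_share(records, predicate):
    """Share of the represented population matching predicate.

    Each record contributes its instance weight instead of counting as one row.
    """
    total = sum(r["instance_weight"] for r in records)
    matching = sum(r["instance_weight"] for r in records if predicate(r))
    return matching / total


def unweighted_share(records, predicate):
    """Share of rows matching predicate, ignoring the instance weight."""
    return sum(1 for r in records if predicate(r)) / len(records)


def is_graduate(record):
    """Hypothetical condition used to split the sample."""
    return record["education"] == "Bachelors"


# Three made-up survey rows standing in for 1000, 3000 and 1000 people.
sample = [
    {"education": "Bachelors", "instance_weight": 1000.0},
    {"education": "High school", "instance_weight": 3000.0},
    {"education": "Bachelors", "instance_weight": 1000.0},
]

print("share of rows:", unweighted_share(sample, is_graduate))
print("share of population:", weighted_share(sample, is_graduate))
```

In a Power BI visual, a comparable effect would presumably be obtained by summing the instance weight field rather than counting rows, although the exact setup depends on how the columns are named after import.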
Given the description of the dataset shown in Table 1, you have been tasked with preparing the data and carrying out an assessment, presented as an analysis report or dashboard in Power BI.
In a minimum of 1,000 words, excluding the list of references, explain your work, interpret the results, and reflect on your experiences:
- Explain how you built your report in the Power BI service (Microsoft, 2019) and the issues you faced. In particular, explain:
  - what you would recommend for checking the quality of the data, given that Table 1 shows ‘Missing Values? Yes’ [4],
  - how you uploaded/retrieved the dataset to the Power BI service [6],
  - how you built your report/dashboard [8], and
  - how you shared your report/dashboard with your tutor and lecturer [2]
- Interpret the results of running your report/dashboard, using any four (4) suitable, interlinked graphs (support your interpretation with visual evidence). [12]
- Reflect on lessons learned, citing any noticeable trends from your findings (support your answer with visual insights). [8]
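As a starting point for the data-quality check asked for above, missing entries can be counted per column before the file is loaded into Power BI. The sketch below is only one possible approach: it assumes missing entries are marked with `?` (confirm this against the UCI metadata), and the column names and rows are invented for illustration.

```python
import csv
import io

# Assumption: missing entries are encoded as "?" (or left empty).
# Confirm the actual encoding against the UCI metadata before relying on this.
MISSING_MARKERS = {"?", ""}


def missing_counts(rows, header):
    """Count missing entries per column across an iterable of CSV rows."""
    counts = {name: 0 for name in header}
    for row in rows:
        for name, value in zip(header, row):
            # Fields may carry a leading space after the comma, so strip first.
            if value.strip() in MISSING_MARKERS:
                counts[name] += 1
    return counts


# Hypothetical column names and a made-up three-row extract.
header = ["age", "class_of_worker", "country_of_birth"]
raw = io.StringIO(
    "34, Private, United-States\n"
    "51, ?, ?\n"
    "27, Private, ?\n"
)
rows = list(csv.reader(raw))

print(missing_counts(rows, header))
```

Columns with a high share of missing entries can then be reviewed individually, for example to decide whether to exclude them from a visual or to show the missing category explicitly.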
Task 2 Resources/List of references:
- Microsoft (2019) ‘From Excel workbook to stunning report in the Power BI service’. Available at: https://docs.microsoft.com/en-us/power-bi/service-from-excel-to-stunning-report (Accessed: 6th March 2020)
- UCI Machine Learning Repository, ‘Census-Income (KDD) Data Set’ description. Available at: https://archive.ics.uci.edu/ml/datasets/Census-Income+%28KDD%29 (Accessed: 14th March 2022)
- Power BI tutorial reference: ‘Power BI Tutorial – A Complete Guide on Introduction to Power BI’. Available at: https://data-flair.training/blogs/power-bi-tutorial/ (Accessed: 17th March 2022)
Referencing (Tasks 1 & 2 Only)
Proper use of reference material, use of in-text referencing, grammatically correct report content, report format and content layout. [8] Total [50 marks]
Task 3 – Presentation of Report/Dashboard
Prepare an 8-12 slide presentation that explains, interprets, and reflects on the dataset and highlights key aspects of your report (refer to the marking guide).
In your PowerPoint presentation:
- Include all the required screenshots.
- Justify the design of your proposed data visualization.
- Demonstrate presentation skills (see marking guide). Total [10 marks]
Assignment Marks Breakdown and Marking Template
Marking Criteria
Description | Possible marks | Awarded Marks |
Task 1: Applied Theory | (40) | |
· Why/Need for advanced analytics and data mining | 4 | |
· Challenges [1 mark per challenge, 6 expected] | 6 | |
· Results obtained [1 mark per result, 4 expected] | 4 | |
· Data sources [1 mark each, 4 expected] | 4 | |
· Types of analytics to combat crime [1 mark each, 2 expected] | 2 | |
· Big picture meaning [2] | 2 | |
· BSC perspectives [valid objective [1], measure [1], target [1]]: | | |
o Business Process | 3 | |
o Learning and Growth | 3 | |
· Recommendations [1 mark, 2 max] and justification [1 mark]: | | |
o Data warehouse architecture | 3 | |
o Data mining | 3 | |
o Text & web analytics | 3 | |
o Big data analytics | 3 | |
Task 2: Visualization using Power BI Report | (50) | |
· Explain Data | | |
o Data preparation [1 mark for each valid step, 4 expected] | 4 | |
o Upload/Retrieve dataset [1 mark for each valid step, 4 expected] | 4 | |
o Build Report [2 marks for each graph, max of 4 expected; 1 mark for linkage, max of 4 expected] | 12 | |
o Share Report [2 – actual submission of this task] | 2 | |
· Interpret the results (support your interpretation with visual evidence) | 12 | |
o Use four (4) ideal and inter-related diagrams. [4] | | |
o What information do these diagrams provide? [2] | | |
o What clues about the illustration can you gain from the general description? [2] | | |
o Can you define or describe the items labeled? [2] | | |
o Are there arrows, numbers or letters that orient the illustration? [2] | | |
· Reflection | 8 | |
o Do you notice any trends in the data? [4 expected, max 4 marks] | | |
o What conclusions can you reach about relationships among items on a diagram? [4 expected, max 4 marks] | | |
o Hint: Base your reflections on insights that are autogenerated from your visuals. | | |
· Referencing & Document Structure | 8 | |
o Referencing: Harvard referencing | | |
o Report layout: grammatically correct report content, report format, content layout, and neatness of work | | |
TASK 3: Presentation (mandatory task) | (10) | |
· Presentation content visual appeal (screenshots, design, skills, readability) | 2 | |
· Ability to explain visuals clearly | 2 | |
· Present within 10 mins | 2 | |
· Answer follow-up questions correctly | 4 | |
Total Marks | 100 |
What to deliver:
- A word-processed report containing Tasks 1 and 2 as stated above, submitted via your Turnitin account.