Tab 7 - Data Analytics and Information Systems - 2 Flipbook PDF

Tab 7 - Data Analytics and Information Systems - 2

23 downloads 108 Views 3MB Size

Recommend Stories


Information systems of Bancaja
Banca. Bancos. Majors holdings. Resources employed. Use of Internet. Intranet

Clean and care systems
QUALITY MADE IN GERMANY SYSTEMAS DE mantenimiento Y LIMPIEZA Clean and care systems North America: Central and South America www.bluechemusa.com w

Una de las soluciones. Big Data: Hadoop. Business Analytics. Big Data
M2M: ¿Hará el „Internet de las cosas“ que la humanidad pase a un segundo plano? ORACLE MAGAZINE PARA MEDIANAS EMPRESAS Business Analytics Acceso de

HYDRAULIC SYSTEMS SAC 2
www.NAPAonline.com HYDRAULIC SYSTEMS SAC 2 ALICATES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 - 21 ARCO DE

2 7 August ORIGINAL: ENGLISH and SPANISH
CBD Distr. GENERAL UNEP/CBD/BS/COP-MOP/7/INF/2 7 August 2014 ORIGINAL: ENGLISH and SPANISH CONFERENCE OF THE PARTIES TO THE CONVENTION ON BIOLOGICAL

Story Transcript

DATA ANALYTICS AND INFORMATION SYSTEMS

Table of Contents ................................................................................................................................. 1 ....................................................................................................................................... 6 ..................................................................................................................................... 8 .............................................................................................................................. 14 ..................................................................................................................................... 17 ............................................................................................................ 19 ......................................................................................................................................... 24 ............................................................................................. 32

1

An information system is a set of integrated components, interacting together to collect, store, process, and distribute information. Components include data, information, people, and information technology.

Data Data are facts that organizations collect, record, store, and process in an information system. Data can be: 

Qualitative – data that represents opinions, feelings or observations that cannot be represented by a numerical value.



Quantitative – data that can be easily measured or quantified objectively.

Qualitative and quantitative can be further broken down by attribute: Qualitative

Quantitative

Nominal Used to label or name variables with no specific order such as gender or colour.

Interval Numerical scales which can be ordered such as time and temperature scales. The differences between categories are of equal measurement.

Ordinal Values that have a meaningful sequence or ranking between them such as grades.

Ratio Numeric scales which can be ordered such as height and weight. The differences between categories are of equal measurement and have an absolute zero (can never be negative).

Data structure refers to how data is stored and organized. Data is either: 

Structured data – data that has a defined length, type, and format and includes numbers, letters, dates, or strings.



Unstructured – data that is not defined and does not follow a specified format. Examples include social media content, emails, podcasts, and transcripts.

2

Information Information is data that has been organized and processed to provide meaning. The activities that produce information within an information system include:

INPUT Capturing or collecting raw data.

OUTPUT

PROCESS Converting raw input into a meaningful format.

Processed information in the form of text, images, video or sound.

The value of information is directly related to what decision makers need in the decision-making process. Valuable information is: 

Accessible – easy to access



Accurate – free of errors



Complete – does not leave out anything that is important



Objective – without bias



Relevant – information should be relevant in both context and subject



Simple to understand



Timely – available when it is needed



Verifiable – the nature of the information is such that different people would produce the same result.

There is always a trade-off between the accuracy and timeliness of information. It takes time to increase the accuracy of financial information, but as it takes longer to get information, its value decreases.

3

People There are many users of information in an organization, and they include: 

Front-line workers – oversee day-to-day operations such as restocking inventory and determining credit limits offered to customers.



Management – make decisions, develop action plans, and solve problems by analyzing operational output.



Executives or senior management – concerned with the overall strategic direction of the organization such as developing a company’s business strategies and determining their overall goals.

Users utilize information systems to support the decisions they make based on their roles:

Strategic Decisions

Executive/senior management making unstructured decisions. Managerial Decisions Management making semi-structured decisions.

Operational Decisions Front-line workers making structured decisions.

Unstructured decisions require judgement and occurs in situations where no established rules or procedures exist. Semi-structured decisions have elements of both structured and unstructured decisions. Structured decisions are repetitive in nature and frequently made.

4

Information Technology Information technology (IT) is the hardware, software, and networks of computer systems that an organization needs to achieve its business objectives.

Hardware Hardware consists of the physical devices associated with a computer system and includes: 

Central processing unit (CPU) – the hardware that executes instructions and coordinates how all the hardware devices will work together.



Primary storage – the main memory of a computer consisting of random-access memory (RAM) and read-only memory (ROM).



Input devices – devices used for entering data such as a keyboard, mouse, scanner, and digital pens.



Output devices – a device that receives data from a computer such as a printer, monitor or speaker.



Secondary storage – a device used for storing information such as a hard disk drive, memory stick, USB drive, and DVDs.



Communication device – connects one IT device to another.

Software The two major types of software are systems software and application software. 

Systems software consists of: o The operating system (OS) – a set of computer programs that control the computer hardware and act as an interface with application programs. Examples include Microsoft Windows, Mac OS, Unix, and Linux. o Utility programs – are part of the OS that aids users in performing tasks to configure, optimize, or maintain a computer such as copying files and performing system diagnostics.



Application software – helps users perform specific tasks. Examples include spreadsheet software, database software and word processing software.

5

Networks Networks refer to the linking of two or more computers together to communicate and to share resources. A network is defined by ownership, geography, protocol, or topology. 

Ownership – networks can be either public or private. o Public – owned by a public network provider (for example Rogers, Bell, Teksavvy, Shaw) to supply infrastructure to other organizations or individuals. o Private – owned by a single entity for exclusive use by that entity.



Geography – networks confined to a geographical space. o Local Area Network (LAN) – designed to connect a group of computers near each other, such as in an office building, a school, or a home. o Wide Area Network (WAN) – spans a large geographic area, such as a province or country and often connects multiple smaller networks together. The world’s most popular WAN is the Internet.



Protocol – a set of rules that determine how data is transmitted between different devices.



Topology – the geometric arrangement of the actual physical organization of devices (nodes) in a network.

Basic Topologies Ring

Star

Line

Bus

Hybrid Topologies (Uses two or more basic topologies) Mesh

Tree

Fully Connected

6

Organizations require a variety of systems to meet their goals and objectives. Information systems are grouped into two major categories: 

Operations support systems



Management decision support systems

Operations Support Systems Operations support systems ae designed to handle the day-to-day operations of a business. 

Transaction Processing Systems (TPS) – workers, front line supervisors, clerks, etc., mainly use non-integrated TPSs to capture and process data for fundamental business transactions.



Enterprise Resource Planning Systems (ERP) – many companies have moved away from using a TPS to implementing an ERP. ERPs include modules or components that are integrated together using a common database that deal with all functions of a company, including: finance, manufacturing, marketing, human resources, and logistics.

Management Decision Support Systems Management decision support systems assist management by providing the necessary information to control the business and make decisions. There are three kinds of management support systems: Management Decision Support System Management Information System (MIS)

User

Used For

Management

Providing information about internal operations to facilitate decision-making, planning and analysis.

Decisions Support System (DIS)

Senior management

Solving problems that are unique, changing frequently, and non-routine by using inputs from TPS and MIS, along with external systems.

Executive Support System (ESS)

Executives

Solving non-specific, high-level unstructured problems at the executive level.

7

Additional systems that an organization may implement include: Type of System

Purpose

Accounting Information Systems (AIS)

An AIS supports all accounting functions and activities. This includes financial reporting, managerial accounting, and tax.

Supply-Chain Management (SCM)

A SCM supports the planning, execution, and control of all activities involved in raw material sourcing and procurement, conversion of raw materials to finished products, and the warehousing and delivery of finished products to customers.

Customer Relationship Management (CRM)

A CRM supports a company in all aspects of customer encounters, including marketing and advertising, sales, customer service after the sale, and programs to keep loyal customers.

8

Information systems do not stay static and as a company’s strategic and organizational goals change, information systems must evolve for a company to stay competitive. There are four reasons why an organization would initiate a systems development project:

Mandated Change

Necessary Change

Developing new systems or making changes to existing systems because of changes in legislation or regulations. For example, change to the payroll calculation tables every year.

Unavoidable changes in systems due to any issue not related to legislation or regulations. For example, upgrading to the latest version of an operating system as the current version is no longer supported.

CHANGE Efficiency Improvements

Effectiveness Improvements

Development projects that streamline processes and generally reduce the need for people. For example, taking a manual busines process and automating it.

Development that addresses the quality of information. If the quality of information can be improved, the company should be able to make more informed decisions.

9

Stakeholders A range of stakeholders including users, managers and business owners are involved in the planning, development, and implementation of an information system. Other stakeholders that may be involved include:

Stakeholder

Role

Business Analyst

Responsible for analyzing the existing and proposed systems using their knowledge of the business.

Chief Information Officer (CIO)

Responsible for all information systems and the IT strategy of an organization.

Database Administrator

Responsible for the design, implementation, and maintenance of an organizations database.

Network Administrator

Responsible for the day-to-day operations of an organization’s networks including installation, maintenance, security, etc.

Project Manager

Responsible for planning and executing an organizations projects, including establishing the project requirements and managing the budget.

Steering Committee

An advisory body that provides strategic oversight and guidance.

Subject Matter Expert (SME)

Provides information about a particular area or topic due to their expertise on the topic.

Systems Analyst

Acts as a coordinator, communicator, and architect who interacts with the various technical and non-technical participants.

10

Systems Development Life Cycle The systems development life cycle (SDLC) is a framework used to develop and implement an information system in an organization. The phases of the SDLC:

Planning

Requirements Maintenance

Analysis

Design

Implementation

Testing

Development

Planning During this phase, an organization creates a high-level plan of the project taking into consideration their overarching strategic goals. Part of the planning process is completion of a feasibility analysis to measure both the tangible and intangible benefits to ensure that the system, when completed, will add value to the organization.

Economic Feasibility Measures the costeffectiveness of a project.

Operational Feasibility Measures how well a solution meets the identified requirements.

Technical Feasibility Measures if there is a technical solution available.

Schedule Feasibility Measures the project time frame.

Legal Feasibility

Measures how well the solution can be implemented within existing legal obligations.

11

After completion of the feasibility analysis, if it is determined that the cost of the system will not exceed both the tangible and intangible benefits, then the project should not proceed. Part of the planning process involves determining if the organization will either build a customized solution or buy a pre-built solution. Factors to consider in this decision include:

Build Functionality

Integration

Cost

Speed

Buy

System is designed to meet an organization needs.

Pre-built solutions may not provide the exact features needed by users.

Difficult to integrate multiple custom-built solutions together.

Generally, purchased solutions can be integrated with other software.

More expensive than buying as the system is built from scratch.

Pre-built solutions, due to the principle of economies of scale, are designed to help many organizations so they are generally cheaper. However, any customization will increase the cost.

Entire solution is built from scratch, Everything is ready when purchased. so it takes longer to complete. This does not include any additional customizations that an organization may request from the vendor.

Requirements Analysis In the requirements analysis phase, end-user business requirements are determined. Functions and operations of the intended information system are carefully defined and documented.

Design The design phase, which determines how user business requirements will be met, is the overall plan or model for the intended system. Desired features and operations of the system are documented including items such as screen layouts and reports.

Development The organization transforms the detailed design into the actual system during the development phase.

12

Testing To ensure the system will function as intended, the company creates a testing environment to test for bugs, errors, and interoperability prior to implementation.

• Unit Testing - individual modules are tested for any potential errors in the code by the developers.

Software Testing

• Integration Testing - modules are grouped together and tested to expose any defects when integrated. • System Testing - the complete system is tested under normal and peak loads for quality, reliability and scalability.

Acceptance Testing

• Involves the actual users of the system to determine if the system meets their expectations, requirements and operates as intended.

Implementation Implementation involves the installation of the system and making it operational. Activities include equipment acquisition and installation, preparation of the site, installation of any operating software changes, training of users, and conversion. Conversion is the process of changing from the existing system to the new system. Four conversion strategies that an organization can employ include:

The existing system continues to be used at the same time as the new system is introduced. Both systems run in parallel for a predetermined amount of time until the new system can be trusted.

The new system is introduced in a single unit/location for a set period before it is installed in other parts of the organization.

13

The system is broken down into smaller functional components that are brought into operation one at a time.

The existing system is discarded, and the new system takes over all at once.

Maintenance After the new system is installed and conversion has been completed, the system will require ongoing routine maintenance to correct errors, improve processing, or to meet changing requirements for the duration of the life of the system.

14

Artificial Intelligence Artificial intelligence (AI) simulates human thinking and behavior, such as the ability to reason and learn, in other words, to mimic human intelligence. Examples of AI include machine learning, neural networks, and virtual reality. AI systems increase the speed and consistency of decision making, solve problems with incomplete information, and resolve complicated issues that cannot be solved by conventional computing. There are two categories of AI: 

Weak AI – Weak AI machines are able to make their own decisions based on reasoning and past sets of data. Most of the AI systems on the market today are weak AI such as Siri, customer support chatbots and online game playing.



Strong AI – Strong refers to the field of artificial intelligence that works toward providing brain like powers to AI machines; in effect, it works to make machines as intelligent as humans.

Machine Learning Machine Learning is the study of how computer programs can make predictions or improve their performance without explicit programming. A machine that can learn, like a human being, can recognize patterns in data, and change its behavior based on its recognition of patterns, experience, or prior learnings. With machine learning, machines are able to act without human programs detailing how to perform tasks. Examples of machine learning is Google search algorithms, speech recognition applications, and email spam filters.

Neural Networks Neural networks may also be referred to as an artificial neural network. It is a category of AI that attempts to emulate the way a human brain works. Neural networks can interpret sensory data to perform tasks, generate reports, and are able to adapt and improve in a similar way to how a human learns.

Virtual Reality Virtual Reality (VR) is a computer simulated environment that can be used to simulate a real world or imaginary world. VR creates a three-dimensional image or environment that can be

15

interacted with by an individual using special electronic equipment such as a helmet with a screen inside or gloves that are fitted with sensors. VR is the fastest growing area of AI.

Robotic Process Automation Robotics process automation (RPA) is a form of business process automation technology that streamlines operations and reduces costs. It is the automation of mundane rules-based business processes that frees up employees to concentrate on value-added tasks. Organizations configure software, or a robot, to automate processes that are routine, consistent, high-volume, prone to human error and require limited decision making.

Internet of Things The internet of things (IoT) is a network of interconnected internet-enabled devices or ‘things’ that use sensors to connect to the internet to send data, receive data or both. Examples of IoT include connected appliances, wearable health monitors, smart home security systems, autonomous farming equipment and wireless inventory trackers.

Blockchain A digital ledger consisting of blocks of data that maintain a permanent and tamper-proof record of transactional data. Blocks are linked together using cryptography and each block contains a cryptographic hash of the previous block, a timestamp and transaction data. The process is as follows: 

A transaction is initiated.



The transaction information is transmitted out to a network of peer-to-peer computers worldwide.



The network solves equations to confirm the validity of the transaction.



Once confirmed to be legitimate the transactions are grouped together into blocks.



Blocks are linked together creating a history of all transactions that are permanent.



Transaction is complete.

16

XML and XBRL eXtensible Markup Language (XML) is designed to make a document readable by both a human and a machine. 

Creates customized tags that facilitate the definition, transmission, and interpretation of data across distinct organizations.

XBRL is short for eXtensible Business Reporting Language. It is an XML-based markup language developed for financial reporting. 

Provides a standard to prepare, publish and exchange financial statements. The use of XBRL means that meaningful comparisons can now be made between financial statements if those financial statements are produced using XBRL.



Public companies in the United States listed with the U.S. Securities and Exchange Commission (SEC) are required to use XBRL when reporting their financial information.



The Canadian Securities Administrators (CSA), in which all Canadian publicly traded companies need to be registered, has a voluntary program to report financial information using XBRL.



Using XBRL internally provides multiple benefits for the organization as well: o o

o

Once the data has been entered the first time, the shell can be reused over and over for reporting. With XBRL, financial statements have accurate numbers, monthly/quarterly and yearly closing processes are shortened, and more time can be spent on analysis, forecasting, and modelling, instead of preparation. Costs are lower because reporting completed on a regular basis is quicker using XBRL. Once the shell has been created, report generation is relatively straightforward and can be automated.

17

Threats and Risk Exposure Enterprise risk is the risk of an event occurring that may reduce the likelihood that the organization will achieve its objectives. Risks specific to information systems relate to cybersecurity, fraud, system reliability, compliance with regulatory requirements, intellectual property, and the ethical use of information. The following are common risk management strategies:

AVOID

DIVERSIFY

SHARE

Redesign the process to avoid particular risks and thus reduce overall risk.

Spread the risk among numerous processes to reduce the overall risk of loss.

Distribute a portion of the risk through a contract with another party.

TRANSFER

CONTROL

Distribute all the risk through a contract with another party.

Design actvities to prevent, detect, or contain adverse events.

ACCEPT Tolerate risks when the cost of managing them is greater than the potential harm.

Ethical Use of Information Information systems raise many ethical concerns for both individuals and organizations around the creation, collection, duplication, distribution, and processing of information. The protection of an individual’s privacy is one of the largest ethical issues facing organizations today. Protecting confidential information is not only a business requirement but also a legal requirement. In Canada, two statutes govern the privacy of personal information: 

The Privacy Act which regulates government agencies



The Personal Information Protection and Electronic Documents Act (PIPEDA), which governs the collection and use of personal information by organizations.

Other areas of concern include: 

Copyright – the legal protection afforded to an expression of an idea such as book.

18



Counterfeit software – software that is manufactured to look like and sold as the real thing.



Digital rights management – a technological solution that allows publishers to control their digital media. Used to limit or prevent illegal copying and distribution.



Intellectual property – intangible creative work embodied in physical form and includes copyrights, trademarks, and patents.



Patent – an exclusive right granted by the government to an inventor, to make, use, and sell an invention.



Pirated software – the unauthorized use, duplication, distribution, or sale of copyrighted software.

Fraud Fraud is any intentional means an individual or organization uses to gain an unfair advantage over another individual or organization. The fraud triangle is a framework designed to explain the reasoning behind an individual’s willingness to commit fraud. Pressure An individual’s incentive or motivation to commit fraud. Three common types of pressure are financial, emotional and lifestyle. Pressure includes not being able to pay one’s debts, fear of loss of status, or a desire to meet financial or performance targets.

Opportunity Allows an individual to commit the fraud, conceal the fraud and convert the proceeds. Opportunities include lack of internal controls, excessive trust in key employees, inadequate staffing, and inadequate supervision.

Rationalization Allows an individual to justify their behaviour. Rationalization can take on many forms including “I was just borrowing the money”, “Everybody does it” and “I was only taking what is owed to me”.

19

Database Management Systems A database is an organized collection of electronically stored data. A database management system (DBMS) is software that helps manage the database. It acts as an interface between the database and the various application programs. Examples of popular DBMSs are MySQL and Microsoft Access.

Database

DBMS Application

Application

Application

Database Hierarchy A hierarchical database model organizes data into a traversed tree-like structure that starts with a character and progresses into a database.

Database

Customer Record 1 Customer Number

Customer Table

Sales Table

Customer Record 2

Customer Record 3

Customer Name

Customer Address

Inventory Table

Customer Birthday

20



Database – a set of interrelated tables which together forms a database.



Table – a collection of related records.



Record – a collection of related data fields. A record holds all the information about an entity in the file. For example, each customer would have a record and an individual customer’s record would hold all the information about that customer.



Field – a group of characters. For example, customer number and customer name.



Character (not shown) – may be a letter, number, or symbol and each character occupies a single position in a field. For example, each letter in a customer’s name is a character.

Relational Databases A relational database is a type of database that provides access to data with pre-defined relationships. Organizations use relational databases because the data is only stored once, complex queries can be created, security and access to the data is improved and they are naturally scalable and extensible. In a relational database, data is stored in tables and follows the data hierarchy format noted above. The example below shows two tables: a vendor table and an inventory table. Tables consist of columns and rows. Columns, also known as fields or attributes, represent characteristics of the entity being described. There are six attributes in the vendor table: vendor number, vendor name, address, city, province, and postal code. Vendor Table Vendor Number 123 125 Primary key

Vendor Name Raden Ltd. Stenner Company

Address 65 Brown Rd. 878 Risa Rd.

City Toronto Regina

Province ON SK

Fields

Record

Foreign key Inventory Table Item Number Description 2041 Stove 2058 Stove 3015 Fridge

Postal Code L6YT8T S4N9R3

Price 899.00 799.00 759.00

Vendor Number 123 189 125

21

Each row in a table (called a tuple) is a record and contains data about a specific occurrence of the type of entity represented by that table. For example, each row in the vendor table contains data about an individual vendor and each row in the inventory table contains data about an individual inventory item. A primary key is the attribute that uniquely identifies a specific row in a table. The primary key in the vendor table is the vendor number and the primary key in the inventory table is item number. A primary cannot be blank nor duplicated. For example, there would not be two individual vendors with the same vendor number. A foreign key is an attribute in a table that is the primary key in another table. Foreign keys link tables together.

The primary keys, vendor number in the vendor table and item number in the inventory table each have a key symbol next to them. The relationship between the two tables is implemented when the primary key, vendor number in the vendor table, is used as a foreign key in the inventory table, connecting the tables together.

NoSQL Databases A noSQL database is a non-relational database and does not organize data by means of tables. Common data structures used by noSQL databases include key value stores, graphs, and document data. Strengths of a noSQL database include the ability to efficiently process large amounts of data and deal with unstructured data.

22

Data Warehouses and Data Lakes Data lakes and data warehouses are both centralized repositories used for the storage of data from a variety of sources. Data Warehouse

Data Lake

Data structure that is stored

Structured processed data

All types of data (structured and unstructured) in its original form

Storage cost

Expensive for large data volumes

Relatively inexpensive for large data volumes

Agility

Less agile, fixed configuration

Highly agile, flexible configuration

Users

Business professionals

Data scientists/analysts

Organizations should strive to store and maintain high-quality data. Dirty data is data that contains errors or is flawed. Examples of dirty data include data this is duplicated, misleading, incorrect, non-formatted and inaccurate. Dark data stored by an organization is data that they can access but choose not to use. The reasons the data are not used include: 

Does not add value



Extracting the data is too complex



Organization lacks capacity to process the data



Uncertain of the integrity of the data

Data Extraction, Transformation and Loading Data Sources Data

Data

EXTRACT TRANSFORM

LOAD

Data Warehouse

23

Extraction, transformation, and loading (ETL) is a process that: 

Extracts data from various sources including internal and external databases.



Transforms data to a common set of enterprise definitions for querying. For example, data may be sorted or filtered to make it easier to analyze.



Loads data into a data warehouse.

Data Life Cycle The different steps in the life of a piece of data from initial capture to its archive or deletion.

Collect

Destroy

Store

Data Life Cycle Archive

Use

Share



Collect – data is acquired or captured.



Store – data, once captured, needs to be stored and protected.



Use – data is used to support activities in an organization (viewed, processed, modified, and saved).



Share – data is shared with internal and external stakeholders.



Archive – data is stored so that it may be used again.



Destroy – data is purged and deleted when it is no longer needed by the organization.

24

Big Data Big data is a collection of large complex data sets, including structured and unstructured data, which cannot be analyzed using standard relational databases. The 4 common characteristics of big data, also known as the 4 Vs of data, are volume, veracity, velocity, and variety. When combined, the result is the fifth V- Value.

Veracity

Volume Amount of available data.

The quality and accuracy of data.

Velocity

Variety

The speed in which data is converted into usable form.

All forms of unstructured and structured data.

Organizations use big data for such items as: 

To determine customer buying patterns by analyzing clickstream data



Sentiment analysis based on social media data



Fraud detection using forensic analyses



Machine-learning-based investment strategies



Enterprise risk management and to reduce uncertainty in decision-making



Product and service improvement

25

Pitfalls of big data include: 

Insufficient computer processing power



Physical storage issues



Integrating big data into an organizations existing enterprise systems



Overall cost

Data Analytics Data analytics is more focused than big data and is the process of evaluating data for the purpose of drawing conclusions to address business questions. Data analytics follows a step-by-step process to provide meaningful insights for decision making purposes.

• Define the question

Step 1

Step 2 • Collect and prepare the data

• Analyze the Data

Step 3

Step 4 • Visualize and share the data

Step 1 – Define the Question Define the objective and determine the question that needs to be addressed. Questions should be measurable, clear, and concise. For example: 

How can we boost customer retention?



Can we reduce staff without compromising quality?



What pricing should be used for our new product line?

26

Step 2 – Collect and Prepare the Data Create a strategy for collecting and aggregating the data the appropriate data to address the question. This step entails the extraction, transformation and loading process. Factors to consider during extraction: 

Data accessibility – is the data accessible?



Data source o Primary data – collected directly by the organization and can include surveys, focus groups, interviews, questionnaires, or direct observation. o Secondary data – existing information collected by others and can include government statistics, trade publications, market research reports or company websites.



Data type – is the data unstructured or structured? Qualitative or quantitative?



Data integrity – is the data accurate and valid?

Factors to consider during transformation: 

Validate that extracted data is complete and that data integrity was maintained during extraction. For example, comparing the number of records that were extracted to the number of records in the source database.



Clean or “scrub” the data – process of detecting and correcting data to ensure analysis will be completed using high-quality data (data that is valid, accurate, complete, consistent, uniform etc.). Cleaning includes: o Removing duplicates and unwanted data outliers o Correcting inconsistences across data and standardizing formats such as date formats

27

Step 3 – Analyze the Data Carry out the various analyses to obtain the insights to address the question. The four broad categories of analysis are: Prescriptive – What do I need to do?

Complexity

Prescriptive

Predictive

Predictive – What is likely to happen?

Diagnostic

Diagnostic – Why is it happening? Descriptive – What is happening?

FUTURE

PAST

Descriptive

Descriptive – Explains what happened by reviewing historical data to gain insight. Looks for patterns within historical data to try and find the reason for a particular outcome. Diagnostic – Determines and explains why it happened by reviewing past performance. Predictive – Predicts possible outcomes using statistical models and machine learning techniques. Prescriptive – Combines descriptive, diagnostic, and predictive analytics to provide a recommended course of action.

Step 4 – Visualize and Share the Data Data visualizations provide a graphical representation of the data results and should provide all the necessary information without additional explanation. Selecting the right visualization requires understanding the needs of decision makers and is tailored accordingly. Selecting the right chart depends on whether the data is qualitative or quantitative in nature.

28

Charts for Qualitative Data

Pie Chart 1st Qtr

2nd Qtr

3rd Qtr

4th Qtr

8%

Pie charts are sectioned into slices that each represent a group of data. The size of the slice shows the percentage of the segment comparted to the total.

10%

23%

59%

Column Chart 5 4 3

Column charts are used to compare different values among categories. Best used to visualize ordinal data.

2 1 0 USA

Canada Mexico Australia

Horizontal Bar Chart Sweden

Horizontal bar charts are best used to visualize nominal values. They are like column charts but are often used to show more categories than a typical column chart.

Germany US All Others 0

2

4

6

29

Stacked Column Chart 10 8

A stacked column charts combines the visual of a bar chart with that of a pie chart. They are used to communicate totals and trends.

6 4 2 0 Quebec

Ontario

Product 1

Product 2

Manitoba Product 3

Tree maps use size and colour to show proportional size of values.

Word Cloud

Word clouds are used for text data instead of categorical data and are formed by counting the frequency of each word mentioned in a data set. The higher the frequency of a word in proportion to the total, the larger the word in the word cloud.

30

Charts for Quantitative Data

Line Chart 5 4 3 2

Line charts are used to show data changes or trends over time.

1 0 January

February

Drink Sales

March Food Sales

Scatter Plot 3.5 3 2.5 2 1.5 1 0.5 0

Scatter plots identify the correlation between two variables.

0

1

2

3

Filled geographic maps are used to visualize the ranges of quantitative data across different geographic areas.

31

Box and whisker plots are used when summarizing a set of data that is measured on an interval scale.

Business Intelligence Business intelligence (BI) refers to the tools used to retrieve, analyze, and transform data. It encompasses online analytical processing, data mining, business performance management, predictive analytics, prescriptive analytics, descriptive analytics, big data, and benchmarking to help management analyze information to make business decisions. BI dashboards are data visualizations and analysis tools that track metrics, key performance indicators and other important data all on one screen or page.

32

Example 1 Paint them Blue for Autism (PBA) is a charitable organization whose mission is to support autistic individuals in the Greater Toronto Area (GTA). The organization runs three fundraising campaigns a year and the campaigns generate approximately $44,000 annually. Paint Them Blue is their longest running campaign. Donors raise funds by painting their nails or any object blue and posting a picture to social media tagging PBA. The Light Em Up campaign is run in conjunction with World Autism Day (April 2nd) where landmarks in and around the GTA are lit blue (for example, the CN Tower in Toronto). The third campaign, Blue Day was launched in 20X1. In this campaign, students in a limited number of GTA school communities and employees in corporations pay a nominal amount to dress casually in blue on a specified day each month. The board has come to you, CPA, to analyze the gift patterns over the past two years, including the number, types, and characteristics of donors. They would also like you to provide them with any recommendations based on your findings. The following visualizations have been provided to you by a member of your team to assist you in your analysis.

33

34

35

36

Solution: To: The Board From: CPA RE: Analysis of Donors and Campaigns Thank you for allowing me this opportunity to analyze your gift data. I have several observations that I would like to discuss. Your organization has three donor types: individuals, schools, and corporations, with the largest by volume being individuals. For the last two years, 4,513 donations were received from individuals, 505 from schools and 190 from corporations for a total of 5,208 donations. If we break down the donations further by year, it appears that donations from both schools and corporations have increased slightly in both number and dollar value as follows: 

Corporate donations increased from 83 to 90 and total dollars increased from $7,246 to $9,037.



School donations increased from 61 to 64 and total dollars increased from $3,590 to $4,211.

However, there seems to be a slight decrease in individual donations. The number of donations dropped from 2,209 to 2,128 and the dollar value from $32,702 to $31,541. Individual donations do make up the majority of all fundraising efforts accounting for 70% of the donations received in 20X2 ($31,541/$44,789). Profile of Donors The average age of individual donors is 42. Of the donors that provided their gender, 1,830 were female and 2,477 were males. Since it appears that the majority of the donors were male, it may be that males are more likely to donate than females to these campaigns. More research will need to be gathered to determine if this is the case. The results of this research may impact future marketing efforts. Campaign success The most successful campaign overall is your longest running campaign, Paint them Blue, making up 80.12% of all donations for the last two years. However there has been a slight decrease in donations in 20X2 when compared to 2021. Blue Day made up 11.79% of total donations received and Light Em Up, 8.09%. Both Blue Day and Light Em Up saw an increase in donations when broken down, year over year. To help increase donations across all campaigns, it is recommended that all donors be informed of all campaigns. Individual donors can be informed of the Blue Day and Light Em Up campaigns so that they may consider bringing this information to their current employer for consideration as a fundraising initiative.

37

Another possibility is to reach out directly to the School Boards so that they can provide the Blue Day campaign details to all schools within their boards. Private school registries can be cross referenced and private schools can also be contacted with the campaign details. This may help increase overall participation from more schools. Donors appear to be concentrated in Pickering, Markham, Toronto, Mississauga, Oakville, Hamilton, and Caledon. It is recommended that marketing to additional communities in the GTA may also assist in overall fundraising efforts. Finally, it is recommended that Paint them Blue for Autism considers approaching foundations and community groups. There may be additional sources of funds (donations and grants) that you may qualify for.

38

Example 2 Beauty Mark Inc. (Beauty) is a cosmetics retailer with 5 locations in Vancouver, British Columbia. You, CPA work as an analyst at Beauty and report to Sam Walters, the newly appointed CFO. Sam would like you to analyze the sales of each of the 5 stores for the past two years and determine if any of the locations should be closed. Based on the sales data and square footage for each store, you have prepared the following visualizations to assist you in your analysis.

39

Solution: To: Sam Walters From: CPA RE: Analysis of Sales for Beauty Mark Inc. As requested, I have completed the analysis for each of our 5 locations. Based on the sales data the English Bay location has the highest sales in both 20X1 and 20X2 and was the only store to see an increase in sales year over year. Sales at the Canada Place location are flat year over year, while our locations in Gastown, Granville Island, and Kitsilano Beach all saw a decline in sales in 20X2. The greatest decline, year over year, is at our Granville Island location. When reviewing the sales per square foot by store, it is noted that the Kitsilano Beach location is the best performer of the group. Granville Island is the worst performing store of all the locations on a sales per square foot basis, followed by the Canada Place location. Although sales at Kitsilano Beach have declined in 20X2, its sales per square foot is still the highest of all the stores, therefore it is performing well in comparison to the other locations. Further investigation is required to determine why sales have declined in 20X2. Granville Island has not only had the highest decline year over year but is the worst performer when comparing its sales per square foot to the other locations. This store location should not be closed based on the financial results alone, however additional qualitative factors need to be assessed before a final decision is made. Some qualitative factors to consider include: 

Does Beauty Mark Inc. want to maintain brand awareness in this area despite declining sales levels?



Has there has been a change in staff or management?



Has a competitor opened close to this location?



What is the population size of the surrounding area that the store services and if the store is selling the right mix of products for the customer base?

Please note that this list is not exhaustive and other qualitative considerations may be valid.

40

Example 3 Ceylon Co. (CC) manufactures industrial cylinders. You, CPA, are the Senior Accountant on the annual audit of CC for the year ended December 31, 2022. As the senior on the audit the first account you are to analyze is the allowance for doubtful accounts. CC has provided the audit team with an electronic copy of its accounts receivable logs to help to determine whether the allowance for doubtful accounts is reasonable. Items to note include: 

All of CC’s sales are on credit with a 30-day payment term.



CC’s policy is to record an allowance equal to 3% of outstanding receivable balances at year-end.



For the year end December 31, 2021, CC had outstanding receivables of $34,637 and recorded an allowance for doubtful accounts of $1,040. At December 31, 2022 $14,729 of those receivables was still outstanding.



At December 31, the balance in the allowance for doubtful accounts was $2,184.

The following visualizations, based on the dataset provided by CC, contain the detail to assist in the analysis of CC’s allowance for doubtful accounts. See excel file for the following*: 1. Annual sales analysis 2. Accounts receivable analysis for total accounts receivables at year end 3. Account receivables that are 1 day late or greater 4. Account receivables that are 365 days late or greater 5. Sales history for customer KX60, HP37, PP82 and FG62

* If you are using a hard copy binder, see the PASS private pages, in the Student Log-in, for the soft copy binder and excel file

41

Solution: To: Partner From: CPA, Audit Senior RE: Ceylon’s allowance for doubtful accounts balance The allowance for doubtful accounts balance at Dec 31, 2022, for CC was $2,184, which is 3% of the outstanding accounts receivable balance of $72,789. This is consistent with how the balance was previously calculated. Upon review of the accounts receivable log, there are concerns that the allowance is understated. In 2022, sales almost doubled from $195,333 to $370,575. The accounts receivable balance at the end of 2022 was $72,789, which is 20% of the current-year sales ($72,789/$370,575). The accounts receivable balance at the end of 2021 was $34,637, which is 18% of the 2021 sales ($34,637/195,333). Of the accounts receivable for 2022, only $11,453 ($72,789 - $61,336) are current (payment is not due yet) and the remaining $61,336 are all the invoices that are past due. The majority of customers exceed the 30-day payment terms with 20 out of 22 customers having a late invoice outstanding at year end. Further review revealed that four customers have an outstanding balance of $9,463 in total that are over a year old. This amount exceeds the current allowance for doubtful accounts of $2,184. A detailed review of customer accounts revealed:  Customer KX60 has never made a payment on account. Collection is highly unlikely. This represents $4,607 of the $9,463 of the receivables that are greater than a year old. 

Customer HP37 has approximately 25% of its sales outstanding with the majority of these from the past few months. Historically, it appears that collection was not an issue, however of the outstanding invoices, $4,359 are greater than a year old.



The amount outstanding for customer PP82 appears to be an anomaly as they have consistently paid their invoices.



Invoices from customer FG62 may be collectible considering 93% [1- ($3,485/$52,842)] of all invoices issued have been paid. Like customer HP37, the two outstanding invoices are from 2021, therefore they also should be investigated further.

In summary, as one customer (KX60) has never made a payment on account (value = $4,607) and given the fact that there are additional customers who have invoices greater than one year late, the current allowance for doubtful accounts of $2,184 is not sufficient. We will need to discuss with management the possibility of adjusting the account.

42

Example 4 Healing Practitioners (HP) is a distributor of supplies and products to nurse practitioner clinics located in Calgary, Alberta. Managing their sales tax has always been an issue for HP as some of the products they carry are zero-rated (products to which GST/HST applies at the rate of 0%). Total GST charged in sales has historically been in the range of 3% to 4% of total sales, however, over the past five years, GST remitted seems to be increasing. HP does not understand why the remittances have increased so much and is asking you, CPA, for your advice. HP invested in a new sales system five years ago, that is integrated with its accounting software. Each evening, the sales, inventory, cost of goods sold, and sales tax accounts are updated. Taxes for all products and services are assigned at the item level rather than at the sales level to ensure that zero-rated items are not taxed in error. The bookkeeper, when preparing the GST return, uses the balances in the GST collected account in the general ledger. The bookkeeper does not cross reference the GST collected account with the individual sales in the sales system that details the actual GST charged for the individual sale items.

43

HP would like you review its GST activity to understand why GST has been increasing over the past few years. You have created the following visualizations based on the GST data provided by HP, to assist you in your response.

44

45

Solution: To: Healing Practitioners From: CPA RE: Review of GST As requested, I have reviewed the information provided and have determined the following regarding the GST activity to understand why GST has been increasing over the past few years. I began my review of your sales for the past five years to validate your suspicions that GST remitted has been increasing. Your sales have been increasing year over year, however, when we review your GST collected as a percentage of sales, you are correct that the GST collected has been increasing and is greater than the 5% GST rate beginning in 20X6. Based on the inventory information provided, all zero-rated inventory items should have a GST rate of 0% and all other inventory items should have a GST rate of 5%. After reviewing the assigned GST rates of all inventory items, it is noted that 16 were assigned an incorrect rate. Ten inventory items were assigned a GST rate of 13% and 6 inventory items were assigned a GST rate of 15%. Total sales of the 16 products over the years is $7,665,405 and GST that was collected was $1,149,811. Assuming all these inventory items should have the 5% tax rate applied to them, $383,270 should have been collected resulting in a GST error of $766,541. This is a significant difference between the GST that should have been collected and remitted compared to what was calculated using the incorrect rates. It is recommended that the individual inventory records be updated immediately to reflect the correct 5% GST rate to ensure that all future amounts are correctly calculated.

46

Example 5 Your childhood friend, Andrew Umber, acquired 100% of Avios Inc. (Avios), a manufacturer of aviation model kits on June 1st. The controller of Avios has provided Andrew with the financial results for the first month following acquisition. Andrew has come to you, CPA, to review the reports and interpret the results.

Income Statement by Line Item 60,000 50,000 40,000 30,000 20,000 10,000 (10,000)

Sales

Cost of sales

Gross profit

Selling, general Depreciation & admin expenses

12-Month average

Interest

June 20X1

EXPENSES AS A % OF SALES 12-MONTH AVERAGE Interest Depreciation

Selling, general & admin expenses

Cost of sales

Net income

47

EXPENSES AS A % OF SALES JUNE 20X1 Interest Depreciation

Cost of sales

Selling, general & admin expenses

12-Month Trend in Sales & Cost of Sales 60,000 50,000 40,000 30,000 20,000 10,000 Jul-X0 Aug-X0 Sep-X0 Oct-X0 Nov-X0 Dec-X0 Jan-X1 Feb-X1 Mar-X1 Apr-X1 May-X1 Jun-X1 Sales

Cost of sales

48

Solution: Avios Financial Results – June 20X1 The first chart, ‘Income Statement by Line Item’, details sales and expenses for the month of June, relative to the average monthly sales and expenses over the past 12 months. Sales for the month of June were slightly higher compared to the monthly average. However, it appears that the June cost of sales and selling, general, and administrative (SGA) expenses were higher than the monthly average. This seems to be a significant increase in these expenses when compared to the small increase in sale relative to the average. As a result of the higher cost of sales and SGA expenses, net income was negative for the month of June, despite being positive on average. The two pie charts, ‘Expenses as a % of Sales 12-Month Average’ and ‘Expenses as a % of Sales June 20X1’ further highlight the increased expenses but detailed out as a percentage of total sales. It is evident that the cost of sales and SGA expenses make up a higher proportion of expenses as a percentage of sales. Depreciation and interest expense, which appear flat in the first chart, now make up a smaller proportion of expenses as a percentage of sales, relatively speaking. This further supports that there may have been an issue with cost of sales and SGA expenses. The final chart, ‘12-Month Trend in Sales & Cost of Sales’, details the trend in sales and cost of sales over the last 12 months. This provides a more granular breakdown of what makes up the average as shown in the previous charts. This chart shows a slight increase in sales over the last 12 months, while cost of sales have trended upward at a more significant rate. As a result, cost of sales has been approaching sales in terms of dollar value, narrowing the gross profit margin. This is a cause of concern because it indicates that the current month’s results may be suggestive of a trend and not the result of one month’s poor performance. Further investigation and additional information are required to determine the reason for the upward trend in cost of sales as it appears to be indicative of a larger trend with the potential to have long-term negative effects on profitability.

Get in touch

Social

© Copyright 2013 - 2024 MYDOKUMENT.COM - All rights reserved.