Foundations of Forensic Data Analysis

Data forensics is a broad term that encompasses identifying, preserving, recovering, and analyzing data, and presenting the results to support a variety of purposes.

Forensic data analysis is the collection, modelling, and transformation of data to identify and highlight potential risk areas, detect non-standard or fraudulent activities that use data, and set up internal controls and processes to minimize a variety of risks.

Data forensics can also involve tracking phone calls, texts, or emails traveling through a network, as well as examining structured data from databases. Digital forensics and data management professionals may use decryption, reverse engineering, advanced system searches, and other high-level analysis methods in their data forensics process.

Investigators use data forensics for crimes including fraud, espionage, cyberstalking, data theft, violent crimes, and more. Computer forensic evidence is held to the same standards as physical evidence in court, which means that forensic data analysis must produce evidence that is authentic, admissible, and reliably obtained.
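One common way to support a claim of authenticity is to record a cryptographic hash of the data at acquisition time and to re-check it before analysis. The following is a minimal sketch of that idea, not a prescribed procedure; the sample data and the function names `sha256_digest` and `verify_copy` are hypothetical.

```python
import hashlib

def sha256_digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of a byte string."""
    return hashlib.sha256(data).hexdigest()

def verify_copy(original: bytes, working_copy: bytes) -> bool:
    """Return True when the working copy is bit-for-bit identical to the original."""
    return sha256_digest(original) == sha256_digest(working_copy)

# Hypothetical evidence: a small export captured at acquisition time.
acquired = b"invoice_id,amount\n1001,250.00\n1002,9800.00\n"
recorded_digest = sha256_digest(acquired)  # would be stored in the case log

untouched = bytes(acquired)                           # faithful working copy
altered = acquired.replace(b"9800.00", b"980.00")     # copy changed after acquisition

print(verify_copy(acquired, untouched))
print(verify_copy(acquired, altered))
```

Any change to the data, however small, produces a different digest, so a matching digest is evidence that the analyzed copy is the one that was acquired.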

Additionally, forensic data analysis helps an organization make better business-critical decisions by applying forensic techniques and tools to enterprise data in order to identify unexplored growth areas and pinpoint risk. Forensics of this nature presents an opportunity to improve a business’s efficiency and manage compliance more effectively, especially by applying data mining techniques to results gathered through forensic data analysis. Data mining can identify different perspectives and transform the data into information that can be used to increase revenue and reduce costs.

The Data Forensics Process

The data forensics process has four stages: acquisition, examination, analysis, and reporting. In acquisition, analysts identify the types of data needed for a situation and review the sources where that data resides to acquire the relevant data from as many datasets as practical. Often, the data is gathered into a structure such as a data warehouse or data mart to enable examination and analysis.
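The acquisition step described above can be sketched in a few lines. This example assumes two hypothetical source extracts and uses an in-memory SQLite database as a stand-in for a data mart; the table layout, source names, and values are illustrative only.

```python
import sqlite3

# Hypothetical extracts from two source systems; all values are illustrative.
erp_payments = [
    (1, "V-100", 1200.00, "2023-03-01"),
    (2, "V-200", 4999.00, "2023-03-02"),
]
bank_payments = [
    (3, "V-200", 4999.00, "2023-03-02"),
    (4, "V-300", 310.50, "2023-03-06"),
]

def build_mart(sources):
    """Load each named source into one table, tagging every row with its origin."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE payments ("
        "payment_id INTEGER PRIMARY KEY, vendor_id TEXT, "
        "amount REAL, paid_on TEXT, source TEXT)"
    )
    for source_name, rows in sources.items():
        conn.executemany(
            "INSERT INTO payments VALUES (?, ?, ?, ?, ?)",
            [row + (source_name,) for row in rows],
        )
    conn.commit()
    return conn

mart = build_mart({"erp": erp_payments, "bank": bank_payments})

# A first look at what was acquired: row count and total amount per source.
summary = mart.execute(
    "SELECT source, COUNT(*), SUM(amount) FROM payments "
    "GROUP BY source ORDER BY source"
).fetchall()
print(summary)
```

Keeping the source name on every row preserves provenance, which matters later when a finding has to be traced back to the system it came from.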

In examination, experts use exploratory data analysis techniques to look at the characteristics of large data sets, often displayed using data visualization so patterns of activity can be identified more easily. The analysis stage creates queries, processes the results, and reviews emerging patterns. Part of the analysis stage is forming hypotheses, using exploratory data analysis methods, about how particular actions may have caused the situation under investigation. If no evidence is found, the hypothesis is discarded, a new one is developed, and the process starts over. These three stages (acquisition, examination, and analysis) are iterative and often require multiple passes before providing salient results.
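The hypothesize, test, and discard loop can be illustrated with a small sketch. The payment records and the two hypotheses below (weekend payments and duplicate payments) are hypothetical examples chosen for illustration; a real investigation would derive its hypotheses from the case at hand.

```python
from collections import defaultdict
from datetime import date

# Hypothetical payment records; all values are illustrative.
payments = [
    {"id": 1, "vendor": "V-100", "amount": 1200.00, "date": "2023-03-01"},
    {"id": 2, "vendor": "V-200", "amount": 4999.00, "date": "2023-03-02"},
    {"id": 3, "vendor": "V-200", "amount": 4999.00, "date": "2023-03-02"},
    {"id": 4, "vendor": "V-300", "amount": 310.50, "date": "2023-03-06"},
]

def weekend_payments(rows):
    """Hypothesis: payments were issued on a Saturday or Sunday."""
    return [r for r in rows if date.fromisoformat(r["date"]).weekday() >= 5]

def duplicate_payments(rows):
    """Hypothesis: the same vendor was paid the same amount on the same day."""
    groups = defaultdict(list)
    for r in rows:
        groups[(r["vendor"], r["amount"], r["date"])].append(r)
    return [r for group in groups.values() if len(group) > 1 for r in group]

def first_supported(rows, hypotheses):
    """Try each hypothesis in turn; discard those that yield no evidence."""
    for name, test in hypotheses:
        evidence = test(rows)
        if evidence:
            return name, evidence
    return None, []

name, evidence = first_supported(
    payments,
    [("weekend payments", weekend_payments),
     ("duplicate payments", duplicate_payments)],
)
print(name, [r["id"] for r in evidence])
```

In this sample data the first hypothesis finds nothing and is dropped, while the second surfaces two matching records for closer review.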

Reporting can take many forms, including written documents, graphical presentations, dashboards, and other visualization or business intelligence techniques.

The iterative processes associated with forensic data analysis can be time-consuming and, if left in the hands of individuals lacking significant experience, can become expensive and frustrating. Most teams of forensic analysis experts include data engineers and forensic scientists who excel at performing technical steps, who understand the processes and internal controls of the investigated company, and who are familiar with patterns of investigative data analysis.

Data Forensics Tools and Software

Many data forensics software packages provide their own tools for recovering or extracting deleted data, and many open source and commercial tools are available for data forensic investigations. Security software such as endpoint detection and response and data loss prevention software typically provides monitoring and logging tools for data forensics as part of a broader data security solution.

Forensic data analysis tools include analytical data processing software used to analyze data from multiple perspectives, to categorize and examine the collected data for suspected behavioral patterns, and to discover weaknesses in the processes under analysis. Many organizations use standard business intelligence and analytics software applications to support their forensic data analysis efforts.

Challenges Facing Forensic Data Analysis

Data forensics faces a variety of technical, legal, and administrative challenges. Technical factors that affect forensic data analysis include encryption issues, the need for large amounts of disk storage space for data collection and analysis, and anti-forensics methods (efforts to circumvent data forensics tools by hardware, process, or software).

Legal challenges can confuse or derail an investigation. One example is attribution: a malicious program can carry out malicious activities without the user’s knowledge, which makes it difficult to determine whether a cybercrime was deliberately committed by a user or executed by malware. The complexities of cyber threats and attacks can create significant difficulties in accurately attributing malicious activity.

Administratively, the main challenge facing data forensics involves accepted standards and the management of data forensic practices. Although many accepted standards for data forensics exist, there is a lack of standardization across and within organizations. Currently, there is no regulatory body that oversees data forensic professionals to ensure they are competent and qualified and are following accepted standards of practice.

Proactive Forensic Data Analysis

The processes used in forensic data analysis present an opportunity for companies to move from a reactive approach to risk mitigation and data security to a proactive one. Active data analysis supports documenting lessons learned and examining information from advanced analytics tools to help improve systems, support additional process security, and enhance regulatory compliance. Adopting a proactive approach to continuing forensic data analysis can have many benefits:

  • Improving corporate risk assessment efforts
  • Improving the speed of fraud detection through better training and awareness of potential risks and patterns of behavior
  • Increasing chances of detecting fraud and risk areas in large data sets
  • Being more responsive in fraud or other suspicious behavior investigations
  • Making internal corrective activities part of an organizational improvement process
  • Meeting compliance and regulatory expectations

The growth of machine learning algorithms and artificial intelligence (AI) in fraud detection software, along with the emergence of processes and specialists in forensic data analysis, is making it more realistic for companies to invest in forensic data analysis and to turn what was once a reactive, damage-limitation exercise into a continuing, active process of improvement, risk identification, and mitigation.
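As a rough illustration of what a continuing, proactive check might look like, the sketch below flags new values that deviate strongly from a historical baseline. It uses a simple z-score rule rather than machine learning, and the function name `flag_outliers`, the threshold, and the sample figures are all assumptions made for the example.

```python
from statistics import mean, stdev

def flag_outliers(history, new_values, threshold=3.0):
    """Return new values lying more than `threshold` standard deviations from the historical mean."""
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        # A constant history: anything different from the baseline is unusual.
        return [v for v in new_values if v != baseline]
    return [v for v in new_values if abs(v - baseline) / spread > threshold]

# Hypothetical daily expense claims from a prior period, and a new batch to screen.
history = [100, 102, 98, 101, 99, 103, 97]
new_batch = [101, 250, 99]

print(flag_outliers(history, new_batch))
```

A check like this could run on a schedule against newly loaded data, with flagged items routed to an analyst for review rather than treated as proof of wrongdoing.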


Although forensic data analysis is a relatively new field, its use of analytics techniques and software to investigate large datasets for behavior patterns can provide many financial and organizational benefits to any enterprise.


Anne Marie Smith, Ph.D.

Anne Marie Smith, Ph.D. is an internationally recognized expert in the fields of enterprise data management, data governance, data strategy, enterprise data architecture and data warehousing. Dr. Smith is a consultant and educator with over 30 years' experience. Author of numerous articles and Fellow of the Insurance Data Management Association (FIDM), and a Fellow of the Institute for Information Management (IIM), Dr. Smith is also a well-known speaker in her areas of expertise at conferences and symposia.

© Since 1997 to the present – Enterprise Warehousing Solutions, Inc. (EWSolutions). All Rights Reserved
