01 INTRODUCTION TO CYBER SECURITY DATA ANALYTICS

The modern world depends on information systems, whether for conducting business, managing large-scale national infrastructure, or socialising and leisure. Underpinning these systems is data, which as an information asset is extremely valuable in both our professional and personal lives. Businesses and individuals alike can become vulnerable to threats that undermine the confidentiality, integrity or availability of vital information assets. The consequences of such incidents can be significant for organisations, leading to loss of reputation, damage to assets, regulatory fines or even physical injury. More details…

If cyber security is how individuals and organisations reduce the risk of cyber attack, then Cyber Security Analytics is concerned with how we translate this concept into something actionable through data science. Security is all about data. When we seek to detect cyber threats, we are analysing data in the form of files, logs, network packets and other artifacts (see "Why Data Science Matters for Security" in Malware Data Science). Furthermore, the threat landscape has expanded dramatically, so hand-crafted manual processes alone will not suffice for analysing malicious activity. The combination of automated tooling for data gathering and analysis, coupled with human reasoning and understanding, makes for a powerful team. Effective processes for gathering, storing, transforming, analysing and visualising data, as typically performed in data science, are therefore crucial for identifying and understanding potential risks, and fundamentally for responding to incidents and mitigating their impact on organisations and individuals. Cyber Security Analytics is about well-informed decision making and improving our cyber situational awareness.

In this section, we will cover:

  • What do we mean by cyber security data analytics?

  • Data science and machine learning in cyber security.

What do we mean by cyber security data analytics?

For an organisation to be secure, a clear understanding of the operational environment is required. This is often described as situational awareness: the perception of environmental elements and events, the comprehension of their meaning, and the projection of their future states. Increasingly, organisations are deploying security operations centres (SOCs), where analysts seek to identify suspicious behaviour and understand its context and relevance to the organisational mission, often using Security Information and Event Management (SIEM) systems.

Technology underpins modern organisations, and having insight into the business operational environment is crucial to protecting it. Ensuring the safe and correct operation of our computer systems and our networking infrastructure is a good place to start. Network traffic data (e.g., packet captures) can help to indicate what data has been communicated over a network, and what actions have been carried out as a result (e.g., access to a particular URL, or downloading of large files). Intrusion Detection Systems (IDS) are commonly used to inspect inbound and outbound network traffic to identify suspicious activities. IDSs generate log files, and these logs constitute another informative data source. Similarly, firewall rules can help us understand how the network is configured, and Intrusion Prevention Systems (IPS) go a step further than an IDS by acting on detected activity to prevent potential harm.
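As a rough illustration of how network traffic data can indicate actions such as the downloading of large files, the following minimal Python sketch totals the bytes exchanged between each pair of hosts and flags pairs that exceed a threshold. The flow records, the IP addresses, the function names and the 50 MB threshold are all assumptions made for this example; in practice the records would be derived from a packet capture or a flow export rather than hard-coded.

```python
from collections import defaultdict

# Hypothetical flow records: (source IP, destination IP, bytes transferred).
# These values are invented for illustration only.
FLOWS = [
    ("10.0.0.5", "93.184.216.34", 1_200),
    ("10.0.0.5", "93.184.216.34", 850),
    ("10.0.0.7", "198.51.100.20", 48_000_000),
    ("10.0.0.7", "198.51.100.20", 52_000_000),
    ("10.0.0.9", "203.0.113.8", 4_500),
]


def bytes_per_pair(flows):
    """Sum the bytes transferred for each (source, destination) pair."""
    totals = defaultdict(int)
    for src, dst, n_bytes in flows:
        totals[(src, dst)] += n_bytes
    return dict(totals)


def flag_large_transfers(flows, threshold_bytes=50_000_000):
    """Return the pairs whose total volume exceeds the assumed threshold."""
    totals = bytes_per_pair(flows)
    return {pair: total for pair, total in totals.items() if total > threshold_bytes}


if __name__ == "__main__":
    for (src, dst), total in flag_large_transfers(FLOWS).items():
        print(f"{src} -> {dst}: {total} bytes")
```

A fixed threshold is the simplest possible rule and would need tuning for any real network. It is shown here only to make the idea of deriving an indicator from traffic data concrete.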

The remit of cyber security is wide and goes beyond traditional computer and network security. A holistic view is required of what we want to protect, and what attack vectors may be used to gain access. Therefore, aspects such as physical security, people security, and process security also need to be understood. Physical security may require CCTV, IoT sensor monitoring or GPS tracking. People security may require text analytics of social media and email usage. Process security may require analysis of business process models, supply chain security, organisational hierarchy information, and operational practice. Technology continues to influence how we conduct business across the globe, and therefore we need to ensure that we understand our threat landscape and have clear monitoring in place to understand potential harms. In many cases, we are interested in spatio-temporal data, i.e., in what location did the activity occur and at what time? Given our highly connected society, location is becoming increasingly challenging to pin down (are we looking at the location of the attacker, the location of the data, or the location of the breached system?), and as for time, devices are logging activities faster than we can humanly inspect them. There is therefore a need for big data cyber security analytics: to make this flow of data manageable and insightful, to highlight key attributes in the data, and to enable informed decisions about how to respond to potential threats.
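To make the idea of spatio-temporal data more concrete, the following minimal Python sketch groups events by the minute in which they occurred and by a location label, reducing a stream of individual log entries to a small table of counts that an analyst can inspect. The event list, the location labels and the function name are hypothetical and chosen only for illustration.

```python
from collections import Counter
from datetime import datetime

# Hypothetical security events: (ISO 8601 timestamp, location label).
# These values are invented for illustration only.
EVENTS = [
    ("2024-01-01T09:00:05", "london"),
    ("2024-01-01T09:00:40", "london"),
    ("2024-01-01T09:00:59", "cardiff"),
    ("2024-01-01T09:01:10", "london"),
]


def events_per_minute(events):
    """Count events for each (minute, location) bucket."""
    counts = Counter()
    for timestamp, location in events:
        # Truncate the timestamp to the start of its minute.
        minute = datetime.fromisoformat(timestamp).replace(second=0, microsecond=0)
        counts[(minute.isoformat(), location)] += 1
    return counts


if __name__ == "__main__":
    for (minute, location), count in sorted(events_per_minute(EVENTS).items()):
        print(minute, location, count)
```

The bucket size (one minute) is an arbitrary choice for this sketch; coarser or finer windows may suit different data volumes.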

Security is about understanding systems, and the people and processes that act upon those systems, such that they remain secure. Can we ever be fully secure? Probably not, but with greater insight into observed activities, we can manage risk more effectively. Data analytics and visualisation techniques are one step towards achieving this.

Security Data Visualisation Skills

Data science and security analytics require a blend of skills that combines the ability to manipulate data, an understanding of statistical techniques, and the domain knowledge of what information is relevant and important for informing a security investigation.

  • Cyber Security Substantive Expertise: This is the security domain knowledge that enables the security practitioner to understand the data, determine what is expected, and find anomalies or metrics from visualisation.

  • Data Hacking Skills: These are the data scientist's practical skills for working with massive amounts of data, which must be acquired, cleaned and sanitised.

  • Math & Statistics Knowledge: This knowledge is critical for choosing which tools to use, and for understanding the spread and other characteristics of the data in order to derive insight from it.
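As a simple illustration of how an understanding of spread can help to surface anomalies, the following minimal Python sketch flags values that lie more than a chosen number of standard deviations from the mean (a z-score test). The daily failed-login counts, the function name and the threshold of 2.0 are assumptions made for this example, and a z-score is only one of many possible measures.

```python
from statistics import mean, pstdev

# Hypothetical daily counts of failed logins for a single account.
# These values are invented for illustration only.
DAILY_FAILED_LOGINS = [12, 15, 11, 14, 13, 12, 90]


def flag_outliers(values, z_threshold=2.0):
    """Return (index, value) pairs lying more than z_threshold
    population standard deviations away from the mean."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values are identical, so nothing stands out.
        return []
    return [
        (index, value)
        for index, value in enumerate(values)
        if abs(value - mu) / sigma > z_threshold
    ]


if __name__ == "__main__":
    print(flag_outliers(DAILY_FAILED_LOGINS))
```

With the sample data above, only the final value (90) is flagged. Note that a single extreme value also inflates the mean and standard deviation, which is one reason why domain knowledge is needed alongside the statistics when choosing a method and threshold.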

AI and Cyber Security

DarkTrace, CheckPoint, Symantec, Sophos, FireEye, Cynet, Fortinet, Vectra, and Cylance are just a handful of the vendors that now use Artificial Intelligence and Machine Learning as part of their products and services for cyber security defence. There are plenty of others too, but this gives an impression of the direction in which the industry is moving.

Cyber security requires a holistic view to identify what should be protected, and how it may be vulnerable to attack. The volume of data generated by today's systems means that humans cannot analyse the raw data effectively. With AI and machine learning techniques, we can filter and manage data observations, whilst visualisation can help human analysts to understand and communicate about those observations and the appropriate responsive actions.

What to expect from the Cyber Security Analytics module?

In many organisations, the role of a cyber security analyst is to help protect an organisation by deploying a range of techniques and processes that can help to prevent, detect and manage cyber threats. Such threats may relate to network-based attacks and malware attacks, where an external actor is attempting to gain access to confidential information, or conducting a denial of service attack to compromise availability of services. Other threats may include insider threats, including the leaking of company secrets, data exfiltration, and accidental or social engineering attacks. This module will study the role of cyber security analytics, covering the technologies used in industry, the role of data analytics, statistics and machine learning to support analytical reasoning, and the domain-specific attributes relating to networking, malware and insider threat analysis, including the development of analytical testing environments and the use of industry-based tools to examine these threats.

Educational Aims

This module will help students to understand the role of data analytics in the context of cyber security. It will address the common job role of cyber security analyst, often working in security operations environments, and focus on the tasks and practices required to examine, identify, assess and mitigate active cyber threats. The module will introduce tools and techniques to support this function, including Security Information and Event Management (SIEM) systems, and practical programming tasks for examining common threats in data streams such as network traffic and computer activity.

Outline Syllabus

Purpose of a cyber security analyst, and common tools used

  • Python, Bash, Linux, Wireshark, tshark, tcpdump

Security Information and Event Management (SIEM) and Security Orchestration, Automation and Response (SOAR) tools

  • Splunk, ELK, Security Onion

Intrusion detection and Intrusion Prevention

  • Snort, Suricata

Development of virtualised simulation environments for generating and analysing data

  • VMware, Scapy, custom scripting

Big data analytics for cyber security, machine learning concepts

  • Unsupervised, supervised

Use cases of cyber security analytics

  • Network traffic analysis, malware analysis, user behavioural analysis, linguistic analysis

Assessment Strategy

Students will produce a practical portfolio of work as their final module assessment, worth 100% of the module. Portfolio tasks will build cumulatively, starting with the initial design and configuration of an experimentation testbed, followed by integration with a SIEM, generation and analysis of sample data, and investigation of and response to a security incident. Students will learn the end-to-end process of data generation, collection, aggregation, analysis, and response, using a practical experimental testbed that they have designed and developed. As part of the portfolio task, students will be expected to demonstrate the functionality of their system through a narrated video demonstration.

Module Preparation

W3 Schools provides an excellent primer to the many different domains of cyber security. The course addresses four key areas related to this module: Cyber Security, Networking, Cyber Attacks and Cyber Defence. You should complete this tutorial so that you have a fundamental understanding of these introductory topic areas, which will serve as a starting point from which we delve deeper during this module. More details at W3 Schools…