UFCFFY-15-M Cyber Security Analytics

Practical Lab: Yet More Network Traffic Analysis


You have been asked to examine a sample of network traffic to investigate suspicious activity on some of the company's workstations. The company directors need to be able to understand what this data shows.

  • Can you analyse the packet capture (PCAP) file provided and produce useful visualisations (e.g., a node-link diagram or a parallel coordinates plot) that help explain the observed activity?
In [2]:
### Load in the libraries and the data
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

def load_csv_data():
    data = pd.read_csv('./example_data/example_pcap.csv')
    return data

data = load_csv_data()
data
Out[2]:
      No.   Time             Source          Destination     Protocol  Length  Info
0     1     01:05:49.468757  172.16.1.4      172.16.1.255    BROWSER   243     Host Announcement CARLFORCE-DC1, Workstation, ...
1     2     01:05:50.279222  172.16.1.4      172.16.1.255    BROWSER   243     Host Announcement CARLFORCE-DC1, Workstation, ...
2     3     01:06:10.328524  172.16.1.201    224.0.0.252     LLMNR     66      Standard query 0x229b A isatap
3     4     01:06:10.390913  172.16.1.201    172.16.1.4      DNS       76      Standard query 0x6ef6 A www.msftncsi.com
4     5     01:06:10.391325  172.16.1.201    172.16.1.4      DNS       76      Standard query 0x6ef6 A www.msftncsi.com
...   ...   ...              ...             ...             ...       ...     ...
8154  8155  01:43:36.828784  172.16.1.141    174.127.99.158  TCP       66      [TCP Retransmission] 49211 > 2017 [SYN] Seq=...
8155  8156  01:43:36.946258  174.127.99.158  172.16.1.141    TCP       54      2017 > 49211 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0
8156  8157  01:43:37.452810  172.16.1.141    174.127.99.158  TCP       62      [TCP Retransmission] 49211 > 2017 [SYN] Seq=...
8157  8158  01:43:37.563033  174.127.99.158  172.16.1.141    TCP       54      2017 > 49211 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0
8158  8159  01:43:38.578617  172.16.1.141    174.127.99.158  TCP       66      49212 > 2017 [SYN] Seq=0 Win=8192 Len=0 MSS=...

8159 rows × 7 columns
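
Before drawing anything, a quick tabular summary can show which protocols and hosts dominate the capture. The sketch below is one possible approach rather than part of the original lab: `sample` is a five-row excerpt copied from the output above so that the cell runs on its own, `summarise_capture` is a hypothetical helper name, and it assumes that `Length` holds the frame size in bytes.

```python
import pandas as pd

# Five rows copied from the output above; in the lab, pass the full `data` frame.
sample = pd.DataFrame({
    'Source': ['172.16.1.4', '172.16.1.4', '172.16.1.201', '172.16.1.201', '172.16.1.201'],
    'Destination': ['172.16.1.255', '172.16.1.255', '224.0.0.252', '172.16.1.4', '172.16.1.4'],
    'Protocol': ['BROWSER', 'BROWSER', 'LLMNR', 'DNS', 'DNS'],
    'Length': [243, 243, 66, 76, 76],
})

def summarise_capture(frame):
    """Hypothetical helper: packet count per protocol and total bytes per source."""
    packets_per_protocol = frame['Protocol'].value_counts()
    # Assumption: 'Length' is the frame size in bytes.
    bytes_per_source = (frame.groupby('Source')['Length']
                             .sum()
                             .sort_values(ascending=False))
    return packets_per_protocol, bytes_per_source

packets_per_protocol, bytes_per_source = summarise_capture(sample)
print(packets_per_protocol)
print(bytes_per_source)
```

Calling `summarise_capture(data)` on the full capture would give the same two tables for all 8159 packets.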

In [80]:
# How might you create a network graph (node-link diagram) using Python?

import matplotlib.pyplot as plt
import networkx as nx

# Build an undirected graph with one node per IP address and one edge per
# distinct Source/Destination pair. Nodes are created automatically when an
# edge is added, and repeated pairs collapse into a single edge.
G = nx.Graph()
G.add_edges_from(zip(data['Source'], data['Destination']))

plt.figure(figsize=(20, 10))
nx.draw(G, with_labels=True)
plt.show()
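
The graph above records only whether two hosts communicated, not how much or in which direction. As a minimal sketch of one possible extension (not part of the original lab), the cell below aggregates packets per Source/Destination pair and stores the totals as edge attributes on a directed graph. The `sample` frame is a five-row excerpt from the table shown earlier, and `build_weighted_graph`, `packets` and `total_bytes` are hypothetical names chosen for illustration.

```python
import pandas as pd
import networkx as nx

# Five rows copied from the capture output; in the lab, pass the full `data` frame.
sample = pd.DataFrame({
    'Source': ['172.16.1.4', '172.16.1.4', '172.16.1.201', '172.16.1.201', '172.16.1.201'],
    'Destination': ['172.16.1.255', '172.16.1.255', '224.0.0.252', '172.16.1.4', '172.16.1.4'],
    'Length': [243, 243, 66, 76, 76],
})

def build_weighted_graph(frame):
    """Hypothetical helper: directed graph with per-edge packet and byte totals."""
    flows = (frame.groupby(['Source', 'Destination'])['Length']
                  .agg(packets='count', total_bytes='sum')
                  .reset_index())
    graph = nx.DiGraph()
    for row in flows.itertuples(index=False):
        graph.add_edge(row.Source, row.Destination,
                       packets=int(row.packets),
                       total_bytes=int(row.total_bytes))
    return graph

G_weighted = build_weighted_graph(sample)

# Line widths that grow with the packet count, in the same order as G_weighted.edges().
widths = [1 + attrs['packets'] for _, _, attrs in G_weighted.edges(data=True)]
```

Passing these widths to the drawing call, for example `nx.draw(G_weighted, with_labels=True, width=widths)`, makes busier flows stand out. On the full capture the widths would probably need scaling (for instance with a logarithm) to stay readable.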
In [7]:
# How could you create a parallel coordinates chart using Python?

import matplotlib.pyplot as plt
from pandas.plotting import parallel_coordinates

protocol_list = ['DNS', 'HTTP', 'SMB2']

# Keep only the protocols of interest, then cast every column to string so
# that matplotlib treats the values as categories on a shared vertical axis.
pc_data = data[['Source', 'Destination', 'Protocol', 'Length']]
pc_data = pc_data[pc_data['Protocol'].isin(protocol_list)].astype(str)

plt.figure(figsize=(20, 10))
# One line per packet, coloured by protocol.
parallel_coordinates(pc_data, 'Protocol', color=('#1b9e77', '#d95f02', '#7570b3'))
plt.show()
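
Because every column is cast to string, IP addresses and packet lengths share one categorical axis, which can make the chart hard to read. One possible alternative, sketched here under stated assumptions rather than taken from the lab, is to map each text column to integer codes and rescale every column to the range 0 to 1 before plotting. The `sample` frame is a five-row excerpt from the capture output and `encode_for_parallel_coordinates` is a hypothetical helper name.

```python
import pandas as pd

# Five rows copied from the capture output; in the lab, pass the filtered `data` frame.
sample = pd.DataFrame({
    'Source': ['172.16.1.4', '172.16.1.4', '172.16.1.201', '172.16.1.201', '172.16.1.201'],
    'Destination': ['172.16.1.255', '172.16.1.255', '224.0.0.252', '172.16.1.4', '172.16.1.4'],
    'Protocol': ['BROWSER', 'BROWSER', 'LLMNR', 'DNS', 'DNS'],
    'Length': [243, 243, 66, 76, 76],
})

def encode_for_parallel_coordinates(frame, class_column):
    """Hypothetical helper: integer-code text columns, then min-max scale to [0, 1]."""
    encoded = frame.copy()
    for column in encoded.columns:
        if column == class_column:
            continue  # the class column is only used for colouring
        if not pd.api.types.is_numeric_dtype(encoded[column]):
            # Codes are assigned in order of first appearance.
            codes, _ = pd.factorize(encoded[column])
            encoded[column] = codes
        values = encoded[column].astype(float)
        span = values.max() - values.min()
        # A constant column has no spread, so place it at 0.
        encoded[column] = (values - values.min()) / span if span else 0.0
    return encoded

encoded = encode_for_parallel_coordinates(sample, 'Protocol')
```

The result can be passed to `parallel_coordinates(encoded, 'Protocol')` in place of `pc_data`. The trade-off is that the axis ticks no longer show the original addresses, so a lookup table of codes may be needed when presenting the chart.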
In [ ]: