UFCFFY-15-M Cyber Security Analytics

Practical Lab 7: Working with Splunk and Python


In this lab, we will explore how to integrate Splunk with Python. Splunk already provides sophisticated analysis through its Search Processing Language (SPL), but driving it from Python gives us finer control over how we query and process our data. We can then use Python's extensive data science libraries, such as Pandas, to perform further analytics.

Useful links:

  • https://dev.splunk.com/enterprise/docs/devtools/python/sdk-python/
  • https://github.com/splunk/splunk-sdk-python
  • https://docs.splunk.com/DocumentationStatic/PythonSDK/1.6.13/client.html

General Guidance for Setting Up with a DetectionLab Instance

You can use the following steps to connect your UWEcyber VM to your DetectionLab logger machine, which runs an instance of Splunk. This is useful if you wish to pull data from your DetectionLab and explore it further in Python and Pandas.

  • First, make sure you have the Splunk SDK installed in your UWEcyber VM: python3 -m pip install splunk-sdk
  • Assuming that your DetectionLab is installed and running, you should be able to navigate to Splunk using a web browser - check that you can access Splunk at https://192.168.56.105:8000
  • From the DetectionLab/Vagrant directory, type: vagrant ssh logger to log into the logger machine via ssh.
  • Navigate to /opt/splunk/etc/system/local and type sudo nano server.conf
  • Add the line allowRemoteLogin = always under the [general] section of the configuration file. Save the file with Ctrl+S and exit with Ctrl+X.
  • Navigate to /opt/splunk/bin and type sudo ./splunk restart to restart Splunk. If asked for a password for user vagrant, type vagrant.
  • Type exit to exit the ssh session.
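Once Splunk has restarted, it can help to confirm from the UWEcyber VM that the management port is reachable before going further. The following is a minimal sketch using only the standard library. The helper name port_is_open is hypothetical, and port 8089 is assumed to be the management port, as in the examples later in this lab.

```python
import socket

def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts
        return False

# Example call (assumed logger address and management port):
# port_is_open("192.168.56.105", 8089)
```

A result of False usually points to the DetectionLab not running or a network issue between the two machines, rather than to the allowRemoteLogin setting, which only affects authentication.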

Basics of cURL usage for Splunk

https://docs.splunk.com/Documentation/SplunkCloud/8.2.2112/RESTTUT/RESTsearches

  • Try the following command: curl -u admin:changeme -k https://192.168.56.105:8089/services/search/jobs -d search="search *". If the steps above are configured correctly, the default credentials will return an XML response containing a search ID (sid). You can then use this ID to request the job's status and results.
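The same request can be made from Python without the SDK, which shows what the cURL command is doing. This is a sketch under assumptions, not a definitive client: the function names are hypothetical, the response is assumed to be a small XML document with a single sid element, and certificate verification is switched off to mirror the -k flag (acceptable only in a lab).

```python
import base64
import ssl
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

BASE_URL = "https://192.168.56.105:8089"  # management port, as in the cURL command

def extract_sid(xml_payload):
    """Pull the search ID out of the XML returned when a search job is created."""
    root = ET.fromstring(xml_payload)
    sid = root.findtext("sid")
    if sid is None:
        raise ValueError("no <sid> element in response")
    return sid

def create_search_job(query, username, password):
    """POST a search to /services/search/jobs and return the new job's sid."""
    # Equivalent of curl -k: skip certificate verification (lab use only)
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE

    # Equivalent of curl -u: HTTP basic authentication header
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    # Equivalent of curl -d: form-encoded body, which makes this a POST
    data = urllib.parse.urlencode({"search": query}).encode()
    request = urllib.request.Request(
        BASE_URL + "/services/search/jobs",
        data=data,
        headers={"Authorization": f"Basic {token}"},
    )
    with urllib.request.urlopen(request, context=context) as response:
        return extract_sid(response.read())

# Example call (needs a reachable Splunk instance):
# sid = create_search_job("search *", "admin", "changeme")
```

With the sid in hand, the job can be followed up at /services/search/jobs/<sid>, and its results at /services/search/jobs/<sid>/results.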
In [26]:
import splunklib.client as client

HOST = "192.168.56.105"
PORT = 8089
USERNAME = "admin"
PASSWORD = "changeme"

# Create a Service instance and log in 
service = client.connect(
    host=HOST,
    port=PORT,
    username=USERNAME,
    password=PASSWORD)

# Print installed apps to the console to verify login
for app in service.apps:
    print(app.name)
alert_logevent
alert_webhook
appsbrowser
force_directed_viz
introspection_generator_addon
journald_input
launcher
learned
legacy
link_analysis_app
lookup_editor
punchcard_app
python_upgrade_readiness_app
sample_app
sankey_diagram_app
search
splunk-dashboard-studio
splunk_archiver
splunk_essentials_8_2
splunk_gdi
splunk_httpinput
splunk_instrumentation
splunk_internal_metrics
splunk_metrics_workspace
splunk_monitoring_console
splunk_rapid_diag
splunk_secure_gateway
Splunk_TA_bro
Splunk_TA_windows
SplunkForwarder
SplunkLightForwarder
TA-asngen
TA-microsoft-sysmon
ThreatHunting
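Hard-coding credentials in a notebook is fine for the lab defaults, but it is easy to leak them when sharing your work. One possible alternative, sketched below, reads the connection settings from environment variables and falls back to the lab defaults. The variable names (SPLUNK_HOST and so on) and the helper connection_settings are hypothetical choices for illustration.

```python
import os

def connection_settings(env=None):
    """Build keyword arguments for client.connect from environment variables."""
    if env is None:
        env = os.environ
    return {
        # Fall back to the DetectionLab defaults used in this lab
        "host": env.get("SPLUNK_HOST", "192.168.56.105"),
        "port": int(env.get("SPLUNK_PORT", "8089")),
        "username": env.get("SPLUNK_USERNAME", "admin"),
        "password": env.get("SPLUNK_PASSWORD", "changeme"),
    }

# Usage with the SDK (needs a reachable Splunk instance):
# service = client.connect(**connection_settings())
```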
In [27]:
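import pandas as pd
import splunklib.results as results

# Search window, given as ISO-8601 timestamps
kwargs_oneshot = {"earliest_time": "2021-10-01T12:00:00.000", "latest_time": "2022-10-01T12:00:00.000"}

# Count the number of events in each index
searchquery_oneshot = "| tstats count WHERE index=* by index"
oneshotsearch_results = service.jobs.oneshot(searchquery_oneshot, **kwargs_oneshot)

def splunk_to_pandas(reader):
    """Collect the result rows from a ResultsReader into a DataFrame."""
    # The reader can also yield diagnostic Message objects, so keep only result rows (dicts)
    rows = [dict(item) for item in reader if isinstance(item, dict)]
    return pd.DataFrame(rows)

reader = results.ResultsReader(oneshotsearch_results)
d = splunk_to_pandas(reader)
d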
Out[27]:
            index count
0         osquery  2995
1  osquery-status     8
2        suricata    46
3            zeek  7096
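The earliest_time and latest_time values passed to the one-shot search are plain strings, so a typo in the date is easy to miss. One way to avoid this is to build them from datetime objects, as in the sketch below. The helper name splunk_time is hypothetical, and the window is assumed to be in UTC, which the original strings do not state.

```python
from datetime import datetime, timezone

def splunk_time(dt):
    """Format a timezone-aware datetime as an ISO-8601 string with milliseconds."""
    if dt.tzinfo is None:
        raise ValueError("use a timezone-aware datetime to avoid ambiguity")
    return dt.isoformat(timespec="milliseconds")

# Same window as in the cell above, assumed here to be UTC
kwargs_window = {
    "earliest_time": splunk_time(datetime(2021, 10, 1, 12, 0, tzinfo=timezone.utc)),
    "latest_time": splunk_time(datetime(2022, 10, 1, 12, 0, tzinfo=timezone.utc)),
}
print(kwargs_window["earliest_time"])  # 2021-10-01T12:00:00.000+00:00
```

The resulting dictionary can be passed to service.jobs.oneshot in the same way as kwargs_oneshot.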
In [28]:
import splunklib.results as results

kwargs_oneshot = {"earliest_time": "2021-10-01T12:00:00.000", "latest_time": "2022-10-01T12:00:00.000"}
searchquery_oneshot = "search index=suricata | head 10"

oneshotsearch_results = service.jobs.oneshot(searchquery_oneshot, **kwargs_oneshot)

reader = results.ResultsReader(oneshotsearch_results)
d = splunk_to_pandas(reader)
d
Out[28]:
_bkt _cd _indextime _raw _serial _si _sourcetype _subsecond _time host index linecount source sourcetype splunk_server
0 suricata~0~E9D1DED8-2B5E-4CE4-B073-6EDDE62A2180 0:2622 1647389357 {"timestamp":"2022-03-16T00:09:14.158750+0000"... 0 [logger, suricata] suricata:json .158750 2022-03-16T00:09:14.158+00:00 logger suricata 1 /var/log/suricata/eve.json suricata:json logger
1 suricata~0~E9D1DED8-2B5E-4CE4-B073-6EDDE62A2180 0:2602 1647389354 {"timestamp":"2022-03-16T00:09:14.116753+0000"... 1 [logger, suricata] suricata:json .116753 2022-03-16T00:09:14.116+00:00 logger suricata 1 /var/log/suricata/eve.json suricata:json logger
2 suricata~0~E9D1DED8-2B5E-4CE4-B073-6EDDE62A2180 0:2570 1647387852 {"timestamp":"2022-03-15T23:44:12.091112+0000"... 2 [logger, suricata] suricata:json .091112 2022-03-15T23:44:12.091+00:00 logger suricata 1 /var/log/suricata/eve.json suricata:json logger
3 suricata~0~E9D1DED8-2B5E-4CE4-B073-6EDDE62A2180 0:2400 1647387297 {"timestamp":"2022-03-15T23:32:44.108479+0000"... 3 [logger, suricata] suricata:json .108479 2022-03-15T23:32:44.108+00:00 logger suricata 1 /var/log/suricata/eve.json suricata:json logger
4 suricata~0~E9D1DED8-2B5E-4CE4-B073-6EDDE62A2180 0:2384 1647387297 {"timestamp":"2022-03-15T23:32:44.103102+0000"... 4 [logger, suricata] suricata:json .103102 2022-03-15T23:32:44.103+00:00 logger suricata 1 /var/log/suricata/eve.json suricata:json logger
5 suricata~0~E9D1DED8-2B5E-4CE4-B073-6EDDE62A2180 0:2368 1647387297 {"timestamp":"2022-03-15T23:32:44.103095+0000"... 5 [logger, suricata] suricata:json .103095 2022-03-15T23:32:44.103+00:00 logger suricata 1 /var/log/suricata/eve.json suricata:json logger
6 suricata~0~E9D1DED8-2B5E-4CE4-B073-6EDDE62A2180 0:2352 1647387297 {"timestamp":"2022-03-15T23:32:44.103089+0000"... 6 [logger, suricata] suricata:json .103089 2022-03-15T23:32:44.103+00:00 logger suricata 1 /var/log/suricata/eve.json suricata:json logger
7 suricata~0~E9D1DED8-2B5E-4CE4-B073-6EDDE62A2180 0:2336 1647387297 {"timestamp":"2022-03-15T23:32:44.103078+0000"... 7 [logger, suricata] suricata:json .103078 2022-03-15T23:32:44.103+00:00 logger suricata 1 /var/log/suricata/eve.json suricata:json logger
8 suricata~0~E9D1DED8-2B5E-4CE4-B073-6EDDE62A2180 0:2319 1647387297 {"timestamp":"2022-03-15T23:32:44.103064+0000"... 8 [logger, suricata] suricata:json .103064 2022-03-15T23:32:44.103+00:00 logger suricata 1 /var/log/suricata/eve.json suricata:json logger
9 suricata~0~E9D1DED8-2B5E-4CE4-B073-6EDDE62A2180 0:2302 1647387297 {"timestamp":"2022-03-15T23:32:44.103006+0000"... 9 [logger, suricata] suricata:json .103006 2022-03-15T23:32:44.103+00:00 logger suricata 1 /var/log/suricata/eve.json suricata:json logger
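In the output above, the interesting Suricata fields are packed into the _raw column as a JSON string. A possible next step is to parse that string so each field becomes its own key, as sketched below. The helper expand_raw is hypothetical, and the sample rows are made-up stand-ins (the real _raw values are truncated in the output above), so the field names shown are illustrative only.

```python
import json

def expand_raw(rows):
    """Merge the JSON fields in each row's _raw string into a new dict per row."""
    expanded = []
    for row in rows:
        try:
            fields = json.loads(row["_raw"])
        except (KeyError, TypeError, json.JSONDecodeError):
            fields = {}  # no parseable JSON, so keep the row as it is
        if not isinstance(fields, dict):
            fields = {}  # valid JSON but not an object, e.g. a bare number
        expanded.append({**row, **fields})
    return expanded

# Made-up stand-in rows; real rows would come from the search results
sample_rows = [
    {"index": "suricata", "_raw": '{"event_type": "alert", "src_ip": "10.0.0.5"}'},
    {"index": "suricata", "_raw": "not json"},
]
for row in expand_raw(sample_rows):
    print(row.get("event_type"), row.get("src_ip"))
```

Starting from the DataFrame d, the rows could be obtained with d.to_dict("records") and the expanded list passed back to pd.DataFrame.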
In [29]:
import splunklib.results as results

kwargs_oneshot = {"earliest_time": "2021-10-01T12:00:00.000", "latest_time": "2022-10-01T12:00:00.000"}
searchquery_oneshot = "search index=zeek | head 10"

oneshotsearch_results = service.jobs.oneshot(searchquery_oneshot, **kwargs_oneshot)

reader = results.ResultsReader(oneshotsearch_results)
d = splunk_to_pandas(reader)
d
Out[29]:
_bkt _cd _indextime _raw _serial _si _sourcetype _subsecond _time host index linecount source sourcetype splunk_server
0 zeek~1~E9D1DED8-2B5E-4CE4-B073-6EDDE62A2180 1:25307 1647391442 {"ts":1647391439.448863,"uid":"ChwtJl1kqAQootk... 0 [logger, zeek] zeek:json .448863 2022-03-16T00:43:59.448+00:00 logger zeek 1 /opt/zeek/spool/manager/ssl.log zeek:json logger
1 zeek~1~E9D1DED8-2B5E-4CE4-B073-6EDDE62A2180 1:25269 1647391442 {"ts":1647391439.448863,"uid":"CBzMvfpOZtcylE4... 1 [logger, zeek] zeek:json .448863 2022-03-16T00:43:59.448+00:00 logger zeek 1 /opt/zeek/spool/manager/ssl.log zeek:json logger
2 zeek~1~E9D1DED8-2B5E-4CE4-B073-6EDDE62A2180 1:25284 1647391442 {"ts":1647391439.013609,"uid":"CYdh5Pv28gLesFo... 2 [logger, zeek] zeek:json .013609 2022-03-16T00:43:59.013+00:00 logger zeek 1 /opt/zeek/spool/manager/ssl.log zeek:json logger
3 zeek~1~E9D1DED8-2B5E-4CE4-B073-6EDDE62A2180 1:25245 1647391442 {"ts":1647391439.013609,"uid":"CWxOT51gUL6S9F0... 3 [logger, zeek] zeek:json .013609 2022-03-16T00:43:59.013+00:00 logger zeek 1 /opt/zeek/spool/manager/ssl.log zeek:json logger
4 zeek~1~E9D1DED8-2B5E-4CE4-B073-6EDDE62A2180 1:25229 1647391442 {"ts":1647391438.696717,"uid":"CiArdp4VpF78enx... 4 [logger, zeek] zeek:json .696717 2022-03-16T00:43:58.696+00:00 logger zeek 1 /opt/zeek/spool/manager/ssl.log zeek:json logger
5 zeek~1~E9D1DED8-2B5E-4CE4-B073-6EDDE62A2180 1:25214 1647391442 {"ts":1647391438.696717,"uid":"CfMsvor6KIC1GHJ... 5 [logger, zeek] zeek:json .696717 2022-03-16T00:43:58.696+00:00 logger zeek 1 /opt/zeek/spool/manager/ssl.log zeek:json logger
6 zeek~1~E9D1DED8-2B5E-4CE4-B073-6EDDE62A2180 1:25112 1647391439 {"ts":1647391436.877053,"uid":"CAeyRuKRL3MIDEX... 6 [logger, zeek] zeek:json .877053 2022-03-16T00:43:56.877+00:00 logger zeek 1 /opt/zeek/spool/manager/ssl.log zeek:json logger
7 zeek~1~E9D1DED8-2B5E-4CE4-B073-6EDDE62A2180 1:25088 1647391439 {"ts":1647391436.874396,"uid":"Cmtgy53JFDOGT7F... 7 [logger, zeek] zeek:json .874396 2022-03-16T00:43:56.874+00:00 logger zeek 1 /opt/zeek/spool/manager/ssl.log zeek:json logger
8 zeek~1~E9D1DED8-2B5E-4CE4-B073-6EDDE62A2180 1:25401 1647391444 {"ts":1647391436.868701,"uid":"CAeyRuKRL3MIDEX... 8 [logger, zeek] zeek:json .868701 2022-03-16T00:43:56.868+00:00 logger zeek 1 /opt/zeek/spool/manager/conn.log zeek:json logger
9 zeek~1~E9D1DED8-2B5E-4CE4-B073-6EDDE62A2180 1:25362 1647391444 {"ts":1647391436.868701,"uid":"Cmtgy53JFDOGT7F... 9 [logger, zeek] zeek:json .868701 2022-03-16T00:43:56.868+00:00 logger zeek 1 /opt/zeek/spool/manager/conn.log zeek:json logger
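The Zeek records above carry their own timestamp in the ts field of _raw, expressed as seconds since the Unix epoch. The sketch below shows one way to turn that value into a UTC datetime. The helper name zeek_ts_to_datetime is hypothetical, and only the ts value is taken from the output above, since the rest of each record is truncated there.

```python
import json
from datetime import datetime, timezone

def zeek_ts_to_datetime(raw_json):
    """Read the epoch-seconds 'ts' field from a Zeek JSON record as a UTC datetime."""
    record = json.loads(raw_json)
    return datetime.fromtimestamp(float(record["ts"]), tz=timezone.utc)

# Record reduced to the one field visible in the output above
when = zeek_ts_to_datetime('{"ts": 1647391439.448863}')
print(when.strftime("%Y-%m-%d %H:%M:%S"))  # 2022-03-16 00:43:59
```

The printed value agrees with the _time column that Splunk reports for the same event.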