Machine Learning Attack Series: Repudiation Threat and Auditing

This post is part of a series about machine learning and artificial intelligence. Click on the blog tag “huskyai” to see related posts.

  • Overview: How Husky AI was built, threat modeled and operationalized
  • Attacks: The attacks I want to investigate, learn about, and try out

In this post we are going to look at the “Repudiation Threat”, which is one of the threats often overlooked when performing threat modeling, and maybe something you would not even expect in a series about machine learning.

What is Repudiation?

Repudiation is the threat that someone denies having performed an action.

For example, in the case of Husky AI the attacker Mallory replaces the original machine learning model file with a backdoored one, but Mallory just ends up denying having done such a thing!

Audit

How can we add proper auditing to better understand who performed such an action and when? And how can we prove which account actually updated the file, or at least support an investigation and help put the pieces together to uncover the truth?

Auditing, Centralized Monitoring and Notifications

A practical way is to leverage auditd in Linux, and push log files up into a centralized monitoring system.

Companies typically use products such as Splunk, the Elastic Stack, or Azure Sentinel (to name a few) that help collect, analyze, and visualize audit data. Research which product your organization uses and then integrate accordingly.

I will discuss the Linux Audit Daemon auditd in this post, which is what I am using with Husky AI to monitor file access.

Audit Daemon - auditd

To see if auditd is installed already on your Linux (Ubuntu) machine you can run:

sudo apt list --installed | grep auditd

And it should show something like this if it’s installed:

auditd/bionic,now 1:2.8.2-1ubuntu1 amd64 [installed]

If not, it’s easy to install using:

sudo apt install auditd

Now auditd should be started. You can double check by running sudo service auditd status. If for some reason it is not running, it can be started using sudo service auditd start.
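Depending on your distribution you may also want to make sure auditd is started at boot. A minimal sketch, assuming a systemd-based system such as Ubuntu:

# enable auditd at boot and start it right away
sudo systemctl enable --now auditd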

Monitoring a specific file

Auditd is configured using rules that determine which files, and what kind of access to them, will be audited and logged.

With auditctl you can look at the currently configured rules:

sudo auditctl -l

No rules

Now, to add files for auditing you can either modify the auditd rules file (e.g. /etc/audit/rules.d/audit.rules), or use the auditctl command.

Let’s monitor the model file:

sudo auditctl -w /var/www/huskyai/models/huskymodel.h5 -p rwa -k huskyai
  • -w specifies the path of the file to audit
  • -p specifies what operations should be audited, in this case read, write and append
  • -k tells auditd to tag each audit entry with the provided string. This is useful for searching
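Note that rules added with auditctl do not persist across reboots. Below is a minimal sketch of making the rule permanent by appending it to the rules file mentioned earlier and reloading (augenrules merges everything under /etc/audit/rules.d/ into the active ruleset):

# append the watch rule so it survives reboots
echo "-w /var/www/huskyai/models/huskymodel.h5 -p rwa -k huskyai" | sudo tee -a /etc/audit/rules.d/audit.rules

# regenerate and load the rules
sudo augenrules --load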

Now we are ready to simulate someone accessing the file:

cp /var/www/huskyai/models/huskymodel.h5 /tmp/exfil

Using the ausearch utility we can take a look at the audit events:

sudo ausearch -i -k huskyai 

The -i argument resolves user IDs to the actual account names to make the output human readable, and -k filters by the keyword used to tag the events.

The results look like the following:

type=PROCTITLE msg=audit(11/10/20 20:09:44.150:414) : proctitle=cp /var/www/huskyai/models/huskymodel.h5 /tmp/exfil 
type=PATH msg=audit(11/10/20 20:09:44.150:414) : item=0 name=/var/www/huskyai/models/huskymodel.h5 inode=256232 dev=ca:01 mode=file,664 ouid=ubuntu ogid=ubuntu rdev=00:00 nametype=NORMAL cap_fp=none cap_fi=none cap_fe=0 cap_fver=0 cap_frootid=0 
type=CWD msg=audit(11/10/20 20:09:44.150:414) : cwd=/var/www/huskyai/models 
type=SYSCALL msg=audit(11/10/20 20:09:44.150:414) : arch=x86_64 syscall=openat success=yes exit=3 a0=0xffffff9c a1=0x7ffd37ab0466 a2=O_RDONLY a3=0x0 items=1 ppid=6007 pid=6914 auid=ubuntu uid=ubuntu gid=ubuntu euid=ubuntu suid=ubuntu fsuid=ubuntu egid=ubuntu sgid=ubuntu fsgid=ubuntu tty=pts0 ses=108314 comm=cp exe=/bin/cp key=huskyai 

As can be seen, the audit entry contains the time, the username and the details of the operation, in this case /bin/cp. This information can now be correlated with logon times and other data in case a real breach of the system occurs.
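If you just want a quick overview instead of full event details, the audit toolset also ships with aureport, and ausearch supports time filters. A small sketch:

# summary report of audited file access events
sudo aureport -f -i --summary

# only show events tagged huskyai that occurred today
sudo ausearch -i -k huskyai --start today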

Pretty cool.

Monitoring directories

You can also monitor an entire directory, not just a single file. For example, to monitor all files in a directory, add the following line to the audit.rules file:

-w /var/www/huskyai/

Quite simple.
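The equivalent can also be done on the command line. A sketch, reusing the same key for tagging:

sudo auditctl -w /var/www/huskyai/ -p rwa -k huskyai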

Offloading events to a different machine

In a production environment you want to offload audit events quickly to a remote machine for security reasons. This is because an adversary might try to hide their tracks by deleting or tampering with audit information.

If you are interested in learning more, look at Filebeat and Auditbeat from Elastic, for instance. Also, my book about Red Team Strategies contains more details on how to set up auditing, as well as how to set up honeypot files to trick adversaries!
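As a rough idea of what such an integration can look like, below is a minimal sketch of an Auditbeat configuration using its auditd module to ship events to Elasticsearch (the host name is a placeholder, not part of Husky AI):

# auditbeat.yml (sketch)
auditbeat.modules:
- module: auditd
  audit_rules: |
    -w /var/www/huskyai/models/huskymodel.h5 -p rwa -k huskyai

output.elasticsearch:
  hosts: ["https://logging.example.com:9200"]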

Audit Dispatcher - Notifications

The audit framework can be extended with custom dispatchers!

If you are interested in learning more, or want to build one that sends email notifications, check out my GitHub repo audisp-sentinel.
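For context, dispatcher plugins are typically wired up via a small configuration file (on Ubuntu 18.04 under /etc/audisp/plugins.d/). A sketch, where the path to the plugin binary is just a placeholder:

# /etc/audisp/plugins.d/sentinel.conf (sketch)
active = yes
direction = out
path = /usr/local/bin/audisp-sentinel
type = always
format = string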

Notifications on critical assets are important to provide insights and highlight misuse. Another practical approach is to have the blue team create summary reports of activity and share them with engineers, so they can spot potential misuse.

Conclusion

This was a brief introduction to auditing to highlight its importance in mitigating repudiation threats. We walked through setting up auditd and monitoring read and write access to the model file.

Hope it was useful. Cheers.

Twitter: @wunderwuzzi23

Appendix

These are the core ML threats for Husky AI that were identified in the threat modeling session so far and that I want to research and build attacks for.

Links will be added when posts are completed over the next several weeks/months.

  1. Attacker brute forces images to find incorrect predictions/labels
  2. Attacker applies smart ML fuzzing to find incorrect predictions
  3. Attacker performs perturbations to misclassify existing images
  4. Attacker gains read access to the model
  5. Attacker modifies persisted model file - Backdooring Attack
  6. Attacker denies modifying the model file - Repudiation Attack (this post)
  7. Attacker poisons the supply chain of third-party libraries
  8. Attacker tampers with images on disk to impact training performance
  9. Attacker modifies Jupyter Notebook file to insert a backdoor (key logger or data stealer)
  10. Attacker uses Generative Adversarial Networks to create fake husky images
