This report examines the limitations of current machine learning techniques for detecting attacks. We started by surveying systems in the literature that combine semantics with machine learning techniques. Computer scientists [1] adapted the military concept of the kill chain, a systematic process for attacking an adversary, to describe the stages of cyberattacks. Similar kill chain models (also known as intrusion or cyber kill chains) are often used to represent attacks at a high level. This is why we considered such models as a possible line of approach for our work.
Previous work on AI for cybersecurity improved detection accuracy by working on the statistical aspects of a system (feature extraction, model combination).
Current tools may raise many false alarms. Faced with a series of false alarms, human experts may become overloaded and start treating all subsequent alarms as false as well.
Security analysts often consider a flow to be either legitimate or illegitimate, but dealing with the grey zone can be tricky. For example, a network scan can be legitimate if it is run by an administrator; in other cases, it can be the starting point of an attack. A human expert needs contextual elements to decide whether an alarm is false or whether we must remain on our guard. Current security tools cannot make this distinction. In addition, a human analyst is able to relate events to the goal of an attacker (stealing data, denying service). That is why we want to investigate how we can add semantics to our system.
Combining semantics with machine learning techniques
Other works try to add semantics to AI-based systems. Holzinger et al. [8] note that current AI-based cybersecurity tools lack trust. The authors state that combining logic-based approaches (ontologies) with probabilistic machine learning is a promising path toward more trusted and explainable AI.
Although false positives are often considered less critical than false negatives, multiple works have attempted to reduce them because they waste analysts’ time  . However, to the best of our knowledge, few studies have attempted to explain the false positives of their proposed system (i.e. answering the question “why did our system flag those specific samples as attacks when they were not?”). In , the authors found that their false positives corresponded to cases where the user was active very late after working hours. So there was indeed an anomaly that may have been worth reporting, but in that case it was not malicious.
Ricard and Owezarski [11] use an ontology to describe the communication between a vehicle and the rest of the world (entities). Using an ontology reduced the number of features needed (features are extracted for flows, frames, and packets). Inference rules then eased anomaly detection and built contextual information.
Wu et al. [12] considered network traffic as a character stream of words and assumed that normal and attack traffic can be distinguished by their sequences of words. They encoded raw traffic data (rather than feature vectors, to preserve the semantics) and used a classifier to distinguish normal from attack traffic. They also proposed an encoding for feature vectors, but anomaly detection performance was worse than with encoded raw data. One possible explanation is that the raw data contained information, absent from the extracted features, that helped detection.
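The idea of treating raw traffic as a stream of words can be illustrated with a minimal sketch. It is not the encoding or the classifier from the cited work: the tokenizer, the `WordNaiveBayes` class, and the toy payloads below are all assumptions made for illustration, using a simple multinomial naive Bayes over word counts.

```python
import math
import re
from collections import Counter


def tokenize(payload):
    """Split a raw payload (bytes) into lowercase 'words' on non-alphanumeric characters."""
    text = payload.decode("latin-1").lower()
    return [w for w in re.split(r"[^a-z0-9]+", text) if w]


class WordNaiveBayes:
    """Multinomial naive Bayes over word counts with add-one smoothing (illustrative only)."""

    def __init__(self):
        self.word_counts = {}        # label -> Counter of words seen for that label
        self.doc_counts = Counter()  # label -> number of training payloads
        self.vocab = set()

    def fit(self, payloads, labels):
        for payload, label in zip(payloads, labels):
            words = tokenize(payload)
            self.word_counts.setdefault(label, Counter()).update(words)
            self.doc_counts[label] += 1
            self.vocab.update(words)
        return self

    def predict(self, payload):
        words = tokenize(payload)
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, counts in self.word_counts.items():
            total_words = sum(counts.values())
            # Log prior plus smoothed log likelihood of each word.
            score = math.log(self.doc_counts[label] / total_docs)
            for w in words:
                score += math.log((counts[w] + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy, hand-written payloads (hypothetical; not taken from any dataset).
train_payloads = [
    b"GET /index.html HTTP/1.1",
    b"GET /images/logo.png HTTP/1.1",
    b"POST /login HTTP/1.1 user=alice",
    b"GET /search?q=1' OR '1'='1 HTTP/1.1",
    b"GET /item?id=1 UNION SELECT password FROM users HTTP/1.1",
]
train_labels = ["normal", "normal", "normal", "attack", "attack"]

model = WordNaiveBayes().fit(train_payloads, train_labels)
```

Because the words come straight from the payload, tokens such as `union` or `select` remain visible to the classifier, whereas a fixed feature vector (packet counts, durations) would not carry them.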
Sarnovsky and Paralic [13] describe a system combining a knowledge model with multiple machine learning models. The system uses a taxonomy of attack types and subtypes. Their ontology allows navigating between those attack types and querying the relevant machine learning models.
The process to classify an attack is as follows. First, the system separates attack traffic from normal traffic. Then, it traverses the taxonomy to select the relevant attack classes and the models appropriate for detecting them. Finally, it queries the ontology to select all possible sub-types of the attack and their related models.
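The three-step process above can be sketched as follows. This is only an illustration under strong assumptions: a plain dictionary stands in for the ontology, simple threshold rules stand in for trained models, and every class name, feature name, and threshold is hypothetical.

```python
# Hypothetical taxonomy: attack class -> sub-types (a dict stands in for the ontology).
TAXONOMY = {
    "dos": ["syn_flood", "http_flood"],
    "probe": ["port_scan", "ping_sweep"],
}


def is_attack(flow):
    """Step 1 stand-in: binary attack/normal score in [0, 1]."""
    return 1.0 if flow["pkts_per_s"] > 100 or flow["distinct_ports"] > 20 else 0.0


# Step 2 stand-ins: one scoring function per attack class.
CLASS_MODELS = {
    "dos": lambda f: min(1.0, f["pkts_per_s"] / 1000),
    "probe": lambda f: min(1.0, f["distinct_ports"] / 100),
}

# Step 3 stand-ins: one scoring function per sub-type.
SUBTYPE_MODELS = {
    "syn_flood": lambda f: f["syn_ratio"],
    "http_flood": lambda f: 1.0 - f["syn_ratio"],
    "port_scan": lambda f: 1.0 if f["distinct_hosts"] <= 1 else 0.0,
    "ping_sweep": lambda f: 1.0 if f["distinct_hosts"] > 1 else 0.0,
}


def classify(flow):
    """Return (attack_class, sub_type), or ("normal", None) for benign flows."""
    # Step 1: separate attack traffic from normal traffic.
    if is_attack(flow) < 0.5:
        return ("normal", None)
    # Step 2: pick the attack class whose model scores highest.
    attack_class = max(TAXONOMY, key=lambda c: CLASS_MODELS[c](flow))
    # Step 3: look up the sub-types of that class and query only their models.
    sub_type = max(TAXONOMY[attack_class], key=lambda s: SUBTYPE_MODELS[s](flow))
    return (attack_class, sub_type)


# Toy flow records (hypothetical feature names and values).
scan = {"pkts_per_s": 50, "distinct_ports": 80, "distinct_hosts": 1, "syn_ratio": 0.9}
flood = {"pkts_per_s": 900, "distinct_ports": 1, "distinct_hosts": 1, "syn_ratio": 0.95}
web = {"pkts_per_s": 10, "distinct_ports": 2, "distinct_hosts": 1, "syn_ratio": 0.1}
```

The point of the hierarchy is that only the models relevant to the selected branch are queried, so the sub-type models for `dos` are never evaluated on a flow already classified as `probe`.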
Some seemingly benign operations may be steps of an attack (e.g. scans or SMB communication). Raising an alert early enough to give security analysts time to respond while avoiding flooding them with false alerts is difficult.
To detect and prevent cyberattacks, work has been done to describe their lifecycle. The MITRE ATT&CK framework [3] describes the steps of an attack and lists the techniques associated with each step. Milajerdi et al. [2] propose HOLMES, a system that detects ongoing attacks from low-level events and generates a high-level graph that summarizes the attack steps.
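As a rough illustration of mapping low-level events to high-level attack stages, the sketch below groups events per host and records the order in which kill chain stages first appear. The stage names follow the intrusion kill chain phases; the event types, the event-to-stage mapping, and the alert threshold are hypothetical, and this is far simpler than the graph construction used by systems in the literature.

```python
from collections import defaultdict

# Intrusion kill chain phases, listed for reference.
KILL_CHAIN = [
    "reconnaissance", "weaponization", "delivery", "exploitation",
    "installation", "command_and_control", "actions_on_objectives",
]

# Hypothetical mapping from low-level event types to kill chain stages.
EVENT_TO_STAGE = {
    "port_scan": "reconnaissance",
    "phishing_email": "delivery",
    "exploit_attempt": "exploitation",
    "new_service_installed": "installation",
    "beacon_to_unknown_host": "command_and_control",
    "large_outbound_transfer": "actions_on_objectives",
}


def summarize(events):
    """Map (timestamp, host, event_type) tuples to {host: [stages in order of first appearance]}."""
    stages = defaultdict(list)
    for _, host, event_type in sorted(events):
        stage = EVENT_TO_STAGE.get(event_type)  # unmapped events are ignored
        if stage and stage not in stages[host]:
            stages[host].append(stage)
    return dict(stages)


def stage_edges(stage_list):
    """Edges of a simple path graph linking consecutive observed stages."""
    return list(zip(stage_list, stage_list[1:]))


def should_alert(stage_list, min_stages=3):
    """Alert only when several distinct stages were observed (threshold is an assumption)."""
    return len(stage_list) >= min_stages


# Toy event log (hypothetical).
events = [
    (1, "10.0.0.5", "port_scan"),
    (2, "10.0.0.5", "dns_query"),
    (3, "10.0.0.5", "exploit_attempt"),
    (4, "10.0.0.7", "port_scan"),
    (5, "10.0.0.5", "beacon_to_unknown_host"),
    (6, "10.0.0.5", "port_scan"),
]
summary = summarize(events)
```

Here a single scan on `10.0.0.7` does not trigger an alert, while the scan, exploit attempt, and beacon on `10.0.0.5` together do, which is one simple way to trade early warning against alert volume.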
Currently, our models don’t use previous detections to support the current detection. Hierarchical temporal memory (HTM), used in , keeps and uses information from previous stages to detect anomalies.
An attack may consist of a rare succession of frequent events. As noted above, other works have tried to identify stages that are common to every attack ( describes multiple attack scenarios). We wonder whether an AI can extract knowledge about attack patterns and follow an attack in an automated way.
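To make the notion of a rare succession of frequent events concrete, here is a minimal sketch using a first-order transition model with additive smoothing. It is one possible approach chosen for illustration, not a method from the cited works; the event names, the training sequences, and the `TransitionModel` class are hypothetical.

```python
import math
from collections import Counter


class TransitionModel:
    """First-order Markov model over event sequences with additive smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.pair_counts = Counter()  # (previous event, next event) -> count
        self.from_counts = Counter()  # previous event -> number of outgoing transitions
        self.events = set()

    def fit(self, sequences):
        for seq in sequences:
            self.events.update(seq)
            for a, b in zip(seq, seq[1:]):
                self.pair_counts[(a, b)] += 1
                self.from_counts[a] += 1
        return self

    def prob(self, a, b):
        """Smoothed probability of observing event b right after event a."""
        v = len(self.events)
        return (self.pair_counts[(a, b)] + self.alpha) / (self.from_counts[a] + self.alpha * v)

    def surprise(self, seq):
        """Mean negative log-probability of the transitions in seq (higher = rarer)."""
        pairs = list(zip(seq, seq[1:]))
        if not pairs:
            return 0.0
        return -sum(math.log(self.prob(a, b)) for a, b in pairs) / len(pairs)


# Toy benign sessions (hypothetical); every event type occurs often.
benign = [
    ["login", "read_file", "read_file", "logout"],
    ["login", "smb_connect", "read_file", "logout"],
    ["login", "scan", "logout"],
    ["login", "read_file", "smb_connect", "read_file", "logout"],
] * 5

model = TransitionModel().fit(benign)

usual = ["login", "read_file", "logout"]
# Same event vocabulary, but in an order never seen during training.
unusual = ["scan", "smb_connect", "smb_connect", "read_file"]
```

Every event in `unusual` is common in the training data, yet its transitions are not, so `model.surprise(unusual)` is higher than `model.surprise(usual)`. A per-event frequency check alone would not separate the two sequences.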
We will look into works that generalize the functioning of an attack and try to characterize sequences of packets.
We believe current machine learning techniques alone may not be enough to capture the logic of a cyberattack. This is why we started investigating works that combine semantics with machine learning techniques. Previous work has aimed to generalize the lifecycle of an attack. Mapping low-level events to attack stages allows the information to be presented in a way that is more understandable for a human expert.
However, we wonder if there are attacks that can escape the logic described in the literature. For this reason, we would like to investigate the use of machine learning techniques to identify attack patterns.
1. Hutchins, E. M., Cloppert, M. J., & Amin, R. M. (2011). Intelligence-Driven Computer Network Defense Informed by Analysis of Adversary Campaigns and Intrusion Kill Chains. 14.
2. Milajerdi, S. M., Gjomemo, R., Eshete, B., Sekar, R., & Venkatakrishnan, V. (2019). HOLMES: Real-Time APT Detection through Correlation of Suspicious Information Flows. 2019 IEEE Symposium on Security and Privacy (SP), 1137–1152. https://doi.org/10.1109/SP.2019.00026
3. The MITRE Corporation. (n.d.). https://www.mitre.org/.
4. Darktrace Cyber AI Analyst: Autonomous Investigations. (n.d.).
5. Kiran, M., Wang, C., Papadimitriou, G., Mandal, A., & Deelman, E. (2020). Detecting anomalous packets in network transfers: Investigations using PCA, autoencoder and isolation forest in TCP. Machine Learning, 109(5), 1127–1143. https://doi.org/10.1007/s10994-020-05870-y
6. Dromard, J., Roudière, G., & Owezarski, P. (2017). Online and Scalable Unsupervised Network Anomaly Detection Method. IEEE Transactions on Network and Service Management, 14(1), 34–47. https://doi.org/10.1109/TNSM.2016.2627340
7. Vanerio, J., & Casas, P. (2017). Ensemble-learning Approaches for Network Security and Anomaly Detection. Proceedings of the Workshop on Big Data Analytics and Machine Learning for Data Communication Networks, 1–6. https://doi.org/10.1145/3098593.3098594
8. Holzinger, A., Kieseberg, P., Weippl, E., & Tjoa, A. M. (2018). Current Advances, Trends and Challenges of Machine Learning and Knowledge Extraction: From Machine Learning to Explainable AI. In A. Holzinger, P. Kieseberg, A. M. Tjoa, & E. Weippl (Eds.), Machine Learning and Knowledge Extraction (Vol. 11015, pp. 1–8). Springer International Publishing. https://doi.org/10.1007/978-3-319-99740-7_1
9. Kathareios, G., Anghel, A., Mate, A., Clauberg, R., & Gusat, M. (2017). Catch It If You Can: Real-Time Network Anomaly Detection with Low False Alarm Rates. 2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA), 924–929. https://doi.org/10.1109/ICMLA.2017.00-36
10. Le, D. C., & Zincir-Heywood, N. (2021). Anomaly Detection for Insider Threats Using Unsupervised Ensembles. IEEE Transactions on Network and Service Management, 18(2), 1152–1164. https://doi.org/10.1109/TNSM.2021.3071928
11. Ricard, Q., & Owezarski, P. (2020, January). Ontology Based Anomaly Detection for Cellular Vehicular Communications. 10th European Congress on Embedded Real Time Software and Systems (ERTS 2020).
12. Wu, Z., Wang, J., Hu, L., Zhang, Z., & Wu, H. (2020). A network intrusion detection method based on semantic Re-encoding and deep learning. Journal of Network and Computer Applications, 164, 102688. https://doi.org/10.1016/j.jnca.2020.102688
13. Sarnovsky, M., & Paralic, J. (2020). Hierarchical Intrusion Detection Using Machine Learning and Knowledge Model. Symmetry, 12(2), 203. https://doi.org/10.3390/sym12020203