Week 13: Knowledge Integration

Reading tasks
A Semantic Loss Function for Deep Learning with Symbolic Knowledge [ Link ]
SpecGuard: Specification Aware Recovery for Robotic Autonomous Vehicles from Physical Attacks [ Link ]
Informed Machine Learning - A Taxonomy and Survey of Integrating Prior Knowledge into Learning Systems [ Link ]

Week 12: ML Interpretability

Reading tasks
A Survey for Machine Learning Security to Securing Machine Learning for CPS [ Link ]
Asymmetry Vulnerability and Physical Attacks on Online Map Construction for Autonomous Driving [ Link ]

Week 11: Reinforcement Learning

Reading tasks
Mastering the game of Go with deep neural networks and tree search [ Link ]
Adversarial Policies: Attacking Deep Reinforcement Learning [ Link ]

Week 9: Adversarial ML

Reading tasks
L-HAWK: A Controllable Physical Adversarial Patch Against a Long-Distance Target [ Link ]
Generative Adversarial Nets [ Link ]

Week 8: Safety Monitoring in CPS

Reading tasks
Attacks against process control systems: risk assessment, detection, and response [ Link ]
Recovery-Guaranteed Sensor Attack Detection for Cyber-Physical Systems [ Link ]

Examples: Machine Learning Applications

Reading tasks
Deep Residual Learning for Image Recognition [ Link ]
Attention Is All You Need [ Link ]

Blog Post 2: Transformer
This paper introduces a novel sequence transduction architecture named the Transformer, which is based solely on attention mechanisms and eliminates the need for recurrence and convolutions. The model addresses the limitations of sequence models that rely on recurrent computation, which parallelize poorly and become computationally expensive for longer sequences. The Transformer adopts an encoder-decoder structure: the encoder is a stack of identical layers, each combining multi-head self-attention with a position-wise fully connected feed-forward network, while the decoder mirrors this structure but adds a multi-head attention layer over the encoder's output. Using scaled dot-product attention and multi-head attention, the model weights values by the compatibility of queries with keys and attends jointly to information from different representation subspaces. Encoder-decoder attention lets the decoder focus on all input positions, self-attention improves contextual understanding by attending to all positions within a layer, and positional encodings ensure the model captures the order of tokens in a sequence. [Read more ...]
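To make the scaled dot-product attention described above concrete, here is a minimal NumPy sketch; the function name, array shapes, and toy dimensions are illustrative assumptions and not taken from the paper. Multi-head attention applies this same operation several times in parallel on learned linear projections of the queries, keys, and values and concatenates the results.

import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.transpose(0, 2, 1) / np.sqrt(d_k)        # (batch, n_queries, n_keys)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)           # softmax over the key positions
    return weights @ V                                       # (batch, n_queries, d_v)

# Toy example (hypothetical sizes): batch of 1, 4 positions, d_k = d_v = 8.
rng = np.random.default_rng(0)
Q = rng.normal(size=(1, 4, 8))
K = rng.normal(size=(1, 4, 8))
V = rng.normal(size=(1, 4, 8))
print(scaled_dot_product_attention(Q, K, V).shape)           # (1, 4, 8)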

Blog Post 1: ResNet
As neural networks grow deeper, problems such as overfitting, vanishing gradients, and exploding gradients often arise, and this paper is motivated by those difficulties. It proposes deep residual networks (ResNets): by introducing "shortcut connections," the study alleviates the vanishing-gradient problem in deep network training and has had a lasting impact on the field of deep learning. The method explicitly reformulates the network layers as learning residual functions with reference to the layer inputs. By learning residuals, the network becomes easier to optimize, so much deeper models can be trained efficiently, which helps mitigate the degradation in performance that can occur as network depth increases. The experiments show significant improvements on large-scale visual recognition benchmarks such as ImageNet and CIFAR-10, and the success of deep residual networks in major competitions such as ILSVRC and COCO 2015 further demonstrates their power and wide applicability. [Read more ...]
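To make the residual formulation y = F(x) + x concrete, here is a minimal PyTorch-style sketch of a basic residual block with an identity shortcut; the class name, channel count, and layer configuration are illustrative assumptions rather than the paper's exact architecture.

import torch
import torch.nn as nn
import torch.nn.functional as F

class BasicResidualBlock(nn.Module):
    """Computes y = F(x) + x, where F is two 3x3 convolutions (identity shortcut)."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        residual = self.bn2(self.conv2(F.relu(self.bn1(self.conv1(x)))))
        return F.relu(residual + x)   # the shortcut connection adds the input back

# Toy usage (hypothetical sizes): a 64-channel 32x32 feature map.
block = BasicResidualBlock(64)
out = block(torch.randn(1, 64, 32, 32))
print(out.shape)                      # torch.Size([1, 64, 32, 32])

Because the shortcut is an identity mapping, the block only needs to learn the residual F(x); if the optimal mapping is close to identity, the convolutions can simply be driven toward zero, which is part of why very deep stacks of such blocks remain trainable.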