The following is a visual history of interpretation methods for image recognition, recommended by recordtrend.com. This article belongs to the category: professional knowledge.
How have the methods used to explain neural networks developed over the past 11 years?
This article explains and demonstrates techniques such as guided back-propagation on an Inception Net image classifier.
Why is “explanation” important?
One of the biggest challenges in image recognition with machine learning (ML) algorithms, especially modern deep learning, is that it is difficult to understand why a particular input image produces the prediction it does.
Users of an ML model usually want to know which parts of an image were the important factors in a prediction. Such explanations are valuable for many reasons:
Machine learning developers can analyze explanations to debug models, identify biases, and predict whether a model is likely to generalize to new images.
Users of a machine learning model may trust it more if it can explain why it makes a particular prediction.
Regulations around machine learning, such as the GDPR, require some algorithmic decisions to be explainable in human terms.
Therefore, since at least 2009, researchers have developed many different methods to open the “black box” of deep learning and make the underlying models easier to explain.
Below, we present visual interfaces for the most advanced image interpretation techniques of the past decade, with a brief description of each.
We use many great libraries, but we rely on Gradio in particular to create the interfaces you see in the GIFs below, and on the TensorFlow implementations from PAIR-code.
The model behind all of the interfaces is an Inception Net image classifier. You can find the complete code to reproduce this blog post in this Jupyter notebook and on Colab.
Before we dive into the papers, let’s start with a very basic algorithm.
Seven different interpretation methods
Leave One Out
Leave-one-out (LOO) is one of the easiest methods to understand. If you want to know which parts of an image are responsible for a prediction, this might be the first algorithm you would come up with.
The idea is to segment the input image into a set of smaller regions and then run many predictions, masking out one region each time. Each region is assigned an importance score based on how much masking it out affects the output. These scores quantify which regions are most responsible for the prediction.
This approach is slow because it relies on running the model many times, but it can produce very accurate and useful results. Above is an example on a picture of a Doberman.
LOO is the default interpretation technique in the Gradio library, and it requires no access to the internals of the model at all, which is a big advantage.
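The masking loop described above can be sketched in a few lines of NumPy. Here `predict` is a hypothetical stand-in for any classifier that returns the probability of the target class; the toy version below only looks at the top-left quadrant, so that region should receive the highest score.

```python
import numpy as np

def leave_one_out(image, predict, patch=8):
    """Occlude one patch at a time and score each region by the
    drop in the predicted probability of the target class."""
    h, w = image.shape[:2]
    base = predict(image)
    scores = np.zeros((h // patch, w // patch))
    for i in range(h // patch):
        for j in range(w // patch):
            masked = image.copy()
            masked[i*patch:(i+1)*patch, j*patch:(j+1)*patch] = 0.0
            # Importance = how much the prediction falls when masked.
            scores[i, j] = base - predict(masked)
    return scores

# Toy "classifier": score is the mean brightness of the
# top-left quadrant, so only that region matters.
def predict(img):
    return img[:8, :8].mean()

img = np.zeros((16, 16))
img[:8, :8] = 1.0
scores = leave_one_out(img, predict)
```

A real model would replace `predict` with, e.g., a softmax output for the class of interest; the loop itself is unchanged.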
Vanilla Gradient Ascent [2009 and 2013]
Paper: Visualizing Higher-Layer Features of a Deep Network 
Paper: Visualizing Image Classification Models and Saliency Maps 
What the two papers have in common is that they both explore the interior of neural networks using gradient ascent. In other words, they ask which small changes to the input or to the activations would increase the probability of the predicted class.
The first paper applies gradient ascent to the activations. The authors report that “it is possible to find a good qualitative interpretation of high-level features. We show that, although it may be counterintuitive, such interpretation is possible at the unit level, is easy to accomplish, and gives consistent results across various techniques.”
The second paper also uses gradient ascent, but probes the pixels of the input image directly rather than the activations.
The authors’ method “computes class saliency maps, specific to a given image and class”, which “can be used for weakly supervised object segmentation using classification ConvNets.”
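A vanilla saliency map is just the magnitude of the gradient of the class score with respect to each input pixel. The sketch below approximates that gradient by finite differences so it stays framework-free; real implementations use a framework's autodiff instead. The `score` function is a hypothetical toy in which only the centre pixels matter.

```python
import numpy as np

def saliency_map(image, score, eps=1e-4):
    """Approximate d(score)/d(pixel) by central finite differences
    (real implementations use autodiff, e.g. tf.GradientTape)."""
    grad = np.zeros_like(image)
    it = np.nditer(image, flags=['multi_index'])
    for _ in it:
        idx = it.multi_index
        up, down = image.copy(), image.copy()
        up[idx] += eps
        down[idx] -= eps
        grad[idx] = (score(up) - score(down)) / (2 * eps)
    return np.abs(grad)  # saliency = magnitude of the gradient

# Toy class score: a linear model where only the centre 2x2 pixels matter.
w = np.zeros((4, 4))
w[1:3, 1:3] = 1.0
score = lambda img: float((w * img).sum())

sal = saliency_map(np.random.default_rng(0).random((4, 4)), score)
```

For this linear toy model the saliency map recovers `|w|` exactly; for a deep network the map highlights the pixels the class score is most sensitive to.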
Guided Back-Propagation
Paper: Striving for Simplicity: The All Convolutional Net 
This paper proposes a new neural network composed entirely of convolutional layers. Because previous interpretation methods were not suitable for their network, the authors introduce guided back-propagation.
Guided back-propagation filters out negative values during the standard gradient computation, so only signals that contribute positively to the class survive. The authors claim that their approach “can be applied to a wider range of network structures.”
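The core idea can be shown on a tiny dense ReLU network (a stand-in for the paper's all-convolutional net, so this is a conceptual sketch rather than the authors' implementation). At each ReLU, the backward signal is zeroed both where the forward activation was non-positive (as in standard backprop) and, additionally, where the incoming gradient is negative (the "guided" part):

```python
import numpy as np

def guided_backprop(x, W1, W2):
    """Guided back-propagation through a toy x -> ReLU(W1 x) -> W2 net."""
    z = W1 @ x               # pre-activation
    a = np.maximum(z, 0)     # ReLU forward pass
    y = W2 @ a               # scalar class score (W2 is a row vector)
    g = W2.ravel()           # d(y)/d(a)
    g = g * (z > 0)          # standard ReLU gradient mask
    g = g * (g > 0)          # guided: also drop negative gradients
    return W1.T @ g          # guided d(y)/d(x)

W1 = np.array([[1., -1.], [2., 0.]])
W2 = np.array([[1., -1.]])
attr = guided_backprop(np.array([1., 0.5]), W1, W2)
```

Without the extra `g > 0` mask this reduces to the plain gradient; the guided mask is what suppresses the noisy negative flows and yields the cleaner visualizations reported in the paper.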
Grad-CAM
Paper: Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization
Next is gradient-weighted class activation mapping (Grad-CAM). It “uses the gradients of any target concept flowing into the final convolutional layer to produce a coarse localization map highlighting the important regions in the image for predicting the concept.”
The main advantages of this method are that it extends interpretability to a broader class of neural networks (such as networks for classification, captioning, and VQA), and that a post-processing step focuses and localizes the explanation around the key objects in the image.
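Given the final convolutional feature maps and the gradients of the class score with respect to them, Grad-CAM pools each gradient map into a single weight, takes a weighted sum of the feature maps, and applies a ReLU to keep only positive evidence. The feature maps and gradients below are made-up toy tensors, not outputs of a real network:

```python
import numpy as np

def grad_cam(feature_maps, grads):
    """Grad-CAM on feature_maps/grads of shape (K, H, W):
    weight each map by the spatially averaged gradient of the
    class score w.r.t. that map, sum, then ReLU."""
    weights = grads.mean(axis=(1, 2))             # one weight per map
    cam = np.tensordot(weights, feature_maps, axes=1)
    return np.maximum(cam, 0)                     # keep positive evidence

# Two toy 4x4 feature maps: the class gradient favours map 0,
# so the regions where map 0 is active should light up.
A = np.stack([2 * np.eye(4), np.ones((4, 4))])
dYdA = np.stack([np.ones((4, 4)), -np.ones((4, 4))])
cam = grad_cam(A, dYdA)
```

In practice the resulting coarse map (one value per feature-map cell) is upsampled to the input resolution and overlaid on the image as a heatmap.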
SmoothGrad
Paper: SmoothGrad: removing noise by adding noise
As in the earlier papers, this method starts by computing the gradient of the class score function with respect to the input image.
However, SmoothGrad visually sharpens these gradient-based sensitivity maps by adding noise to the input image and then computing the gradient for each of these perturbed versions of the image. Averaging the sensitivity maps together yields a clearer result.
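The averaging step is simple enough to sketch directly. Here `grad_fn` is a hypothetical stand-in for "gradient of the class score w.r.t. the image"; the toy version returns a true signal `w` plus a jittery input-dependent term, which the noise-averaging washes out:

```python
import numpy as np

def smoothgrad(image, grad_fn, n=50, sigma=0.1, seed=0):
    """Average the raw gradients of n noisy copies of the input."""
    rng = np.random.default_rng(seed)
    total = np.zeros_like(image)
    for _ in range(n):
        noisy = image + rng.normal(0.0, sigma, size=image.shape)
        total += grad_fn(noisy)
    return total / n

# Toy gradient: true signal w plus a rapidly oscillating term
# that mimics the noise seen in raw saliency maps.
w = np.array([1.0, 2.0, 3.0])
grad_fn = lambda img: w + np.sin(50 * img)
sg = smoothgrad(np.zeros(3), grad_fn, n=200, sigma=0.5)
```

With enough samples the oscillating term averages toward zero and `sg` approaches the underlying signal `w`, which is exactly the "removing noise by adding noise" effect the title describes.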
Integrated Gradients 
Paper: Axiomatic Attribution for Deep Networks 
Unlike the previous papers, the authors of this paper start from a theoretical basis for interpretation. They “identify two fundamental axioms that attribution methods ought to satisfy: Sensitivity and Implementation Invariance”.
They use these principles to guide the design of a new attribution method, called Integrated Gradients, which produces high-quality interpretations while still only requiring access to the model’s gradients; however, it adds a “baseline” hyperparameter, which can affect the quality of the results.
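Integrated gradients attributes the prediction by integrating the model's gradients along the straight-line path from the baseline to the input. Below is a midpoint Riemann-sum sketch with a made-up quadratic score, chosen so the completeness axiom (attributions sum to the score difference) is easy to check:

```python
import numpy as np

def integrated_gradients(x, baseline, grad_fn, steps=100):
    """Riemann approximation of the path integral of gradients
    along the straight line from `baseline` to `x`."""
    alphas = (np.arange(steps) + 0.5) / steps   # midpoints in (0, 1)
    total = np.zeros_like(x)
    for a in alphas:
        total += grad_fn(baseline + a * (x - baseline))
    return (x - baseline) * total / steps

# Toy score F(x) = sum(x**2), so grad F = 2x.
grad_fn = lambda v: 2 * v
x = np.array([1.0, 2.0])
baseline = np.zeros(2)
ig = integrated_gradients(x, baseline, grad_fn)
# Completeness axiom: attributions sum to F(x) - F(baseline).
```

For this score the attributions come out as `x**2` per coordinate, and their sum equals `F(x) - F(baseline)`, illustrating why the axiomatic framing makes the method's outputs easy to sanity-check.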
Blur Integrated Gradients 
Paper: Attribution in Scale and Space 
This paper proposes a new technique designed to solve specific problems, including eliminating the “baseline” parameter and removing certain visual artifacts that tend to appear in the interpretations.
In addition, it produces scores along the scale/frequency dimension, which essentially gives a sense of the scale of the important objects in the image.
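The key change from plain integrated gradients is the path: instead of interpolating toward a fixed baseline image, gradients are accumulated along a sequence of decreasingly blurred versions of the input, with the heaviest blur playing the role of the baseline. The paper uses Gaussian blur over a scale-space path; the rough sketch below substitutes a repeated box filter to stay NumPy-only:

```python
import numpy as np

def box_blur(img, k):
    """Blur by k passes of a 3x3 box filter (edge-padded);
    a crude stand-in for the paper's Gaussian scale-space."""
    out = img.astype(float)
    for _ in range(k):
        p = np.pad(out, 1, mode='edge')
        out = sum(p[i:i+out.shape[0], j:j+out.shape[1]]
                  for i in range(3) for j in range(3)) / 9.0
    return out

def blur_ig(image, grad_fn, max_blur=20):
    """Accumulate grad * step along a path from the most-blurred
    image (the implicit baseline) back to the sharp input."""
    path = [box_blur(image, k) for k in range(max_blur, -1, -1)]
    total = np.zeros_like(image, dtype=float)
    for cur, nxt in zip(path[:-1], path[1:]):
        total += grad_fn(cur) * (nxt - cur)
    return total

# Demo with a made-up linear score F(img) = (w * img).sum(),
# whose gradient is the constant w: the steps then telescope,
# so the attributions sum to F(image) - F(most-blurred image).
w = np.arange(16.0).reshape(4, 4)
img = np.random.default_rng(0).random((4, 4))
attr = blur_ig(img, lambda v: w, max_blur=5)
```

Because the baseline is derived from the image itself, no separate baseline hyperparameter is needed, which is the problem this paper set out to remove.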
The following chart compares all of these methods: