The Deep Residual Shrinkage Network is an improved variant of the Deep Residual Network. In essence, it is an integration of the Deep Residual Network, attention mechanisms, and soft thresholding functions.
In brief, the Deep Residual Shrinkage Network works as follows: it uses attention mechanisms to notice unimportant features and uses soft thresholding functions to set them to zero; conversely, it notices important features and retains them. In this way, the deep neural network becomes better at extracting useful features from noisy signals.
1. Research Motivation
First, when classifying samples, some noise, such as Gaussian noise, pink noise, or Laplacian noise, is usually unavoidable. More broadly, samples often contain information that is irrelevant to the current classification task, and this information can likewise be regarded as noise. Such noise can degrade classification performance. (Soft thresholding is a key step in many signal denoising algorithms.)
For example, when people converse by a roadside, the recorded audio may be mixed with the sounds of car horns and wheels. If speech recognition is performed on such signals, the background sounds will degrade the results. From the deep learning perspective, the features corresponding to the horns and wheels should be eliminated inside the deep neural network so that they do not harm the speech recognition results.
Second, even within the same dataset, the amount of noise usually varies from sample to sample. (This is related to attention mechanisms. Taking an image dataset as an example, the location of the target object may differ across images, and an attention mechanism can focus on the target object's specific location in each image.)
For example, when training a cat-and-dog classifier, suppose there are five images labeled "dog". Image 1 may contain a dog and a mouse, image 2 a dog and a goose, image 3 a dog and a chicken, image 4 a dog and a donkey, and image 5 a dog and a duck. During training, the classifier will be disturbed by the irrelevant objects (the mouse, goose, chicken, donkey, and duck), which lowers classification accuracy. If we could identify these irrelevant objects and eliminate the features corresponding to them, the accuracy of the cat-and-dog classifier could be improved.
2. Soft Thresholding
Soft thresholding is a key step in many signal denoising algorithms. It eliminates features whose absolute values are smaller than a certain threshold and shrinks the features whose absolute values are larger than that threshold toward zero. It can be implemented with the following formula:
\[y = \begin{cases} x - \tau & x > \tau \\ 0 & -\tau \le x \le \tau \\ x + \tau & x < -\tau \end{cases}\]
The derivative of the soft thresholding output with respect to its input is:
\[\frac{\partial y}{\partial x} = \begin{cases} 1 & x > \tau \\ 0 & -\tau \le x \le \tau \\ 1 & x < -\tau \end{cases}\]
As shown above, the derivative of soft thresholding is either 1 or 0, which is exactly the same property as the ReLU activation function. Therefore, soft thresholding can also reduce the risk of gradient vanishing and gradient exploding in deep learning algorithms.
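To make the formulas concrete, here is a minimal sketch of soft thresholding, assuming PyTorch; the input values and the threshold τ = 1.0 are arbitrary illustrations.

```python
import torch

def soft_threshold(x: torch.Tensor, tau: float) -> torch.Tensor:
    """Soft thresholding, equivalent to the piecewise formula above:
    y = x - tau (x > tau), 0 (-tau <= x <= tau), x + tau (x < -tau)."""
    return torch.sign(x) * torch.clamp(torch.abs(x) - tau, min=0.0)

x = torch.tensor([-2.0, -0.5, 0.0, 0.3, 1.5], requires_grad=True)
y = soft_threshold(x, tau=1.0)
print(y)       # [-1.0, 0.0, 0.0, 0.0, 0.5]: small values zeroed, large ones shrunk

# The gradient is 1 where |x| > tau and 0 inside [-tau, tau], just like ReLU.
y.sum().backward()
print(x.grad)  # [1., 0., 0., 0., 1.]
```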
In the soft thresholding function, the threshold must satisfy two conditions: first, it must be a positive number; second, it must not be larger than the maximum absolute value of the input signal, otherwise the output would be all zeros.
Moreover, the threshold had better satisfy a third condition: each sample should have its own independent threshold, determined by its noise content.
The reason is that the noise content usually differs across samples. For example, it often happens that, within the same dataset, Sample A contains little noise while Sample B contains a lot. In that case, when soft thresholding is used in a denoising algorithm, Sample A should be given a smaller threshold and Sample B a larger one. In deep neural networks, although these features and thresholds no longer have explicit physical definitions, the basic underlying logic is the same: each sample should have its own independent threshold, determined by its own noise content.
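As a toy numerical illustration of this point (not from the paper; the signal, noise levels, and threshold values below are made up), consider two samples with the same sparse useful features but different noise floors:

```python
import torch

torch.manual_seed(0)

def soft_threshold(x, tau):
    return torch.sign(x) * torch.clamp(torch.abs(x) - tau, min=0.0)

# Two "samples" sharing the same useful features (a few large spikes),
# but with noise floors of different strengths.
signal = torch.zeros(20)
signal[[3, 10, 15]] = torch.tensor([2.0, -2.0, 1.5])
sample_a = signal + 0.05 * torch.randn(20)  # Sample A: little noise
sample_b = signal + 0.40 * torch.randn(20)  # Sample B: heavy noise

# A small threshold clears A's noise floor; B needs a larger one. A single
# shared threshold would either leave noise in B or over-shrink A's features.
for name, x, tau in [("A", sample_a, 0.15), ("B", sample_b, 1.0)]:
    kept = (soft_threshold(x, tau) != 0).sum().item()
    print(f"sample {name}: tau={tau}, non-zero features kept = {kept}")
```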
3. Attention Mechanism
Attention mechanisms are not hard to understand in the context of computer vision. Animal visual systems can quickly scan a whole scene to find the target, and then focus attention on the target object to extract more detail while suppressing irrelevant information. For details, please refer to the literature on attention mechanisms.
The Squeeze-and-Excitation Network (SENet) is a relatively new deep learning method built on attention mechanisms. In different samples, different feature channels usually contribute differently to the classification task. SENet uses a small sub-network to learn a set of weights and then multiplies the features in each channel by these weights, adjusting the magnitude of the features channel by channel. This can be viewed as applying a weighting to each feature channel.
In this way, each sample gets its own set of weights; in other words, the weights of any two samples can differ. In SENet, the specific path for obtaining the weights is "Global Pooling → Fully Connected Layer → ReLU Function → Fully Connected Layer → Sigmoid Function", as sketched below.
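The following is a minimal sketch of that weighting path, assuming PyTorch; the `SEBlock` name, the `reduction` ratio of 16, and the tensor sizes are illustrative choices rather than a specific reference implementation.

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """SENet-style channel weighting:
    Global Pooling -> FC -> ReLU -> FC -> Sigmoid, then reweight channels."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        n, c, _, _ = x.shape
        w = x.mean(dim=(2, 3))           # global average pooling -> (N, C)
        w = self.fc(w).view(n, c, 1, 1)  # per-sample, per-channel weights in (0, 1)
        return x * w                     # apply the weighting to each channel

x = torch.randn(2, 64, 8, 8)
print(SEBlock(64)(x).shape)  # torch.Size([2, 64, 8, 8])
```

Because the weights are computed from each input itself, two different samples flowing through the same block receive two different sets of channel weights.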
4. Soft Thresholding with Deep Attention Mechanism
The Deep Residual Shrinkage Network borrows this SENet sub-network structure to implement soft thresholding under a deep attention mechanism. Through the sub-network (the red box in the paper's figure), a set of thresholds can be learned and used to apply soft thresholding to each feature channel.
In this sub-network, the absolute values of all the features in the input feature map are taken first. Then, after global average pooling, a feature denoted A is obtained. In the other path, the feature map after global average pooling is fed into a small fully connected network. This fully connected network uses a Sigmoid function as its last layer to normalize its output to the range 0 to 1, yielding a coefficient denoted α. The final threshold is α × A, that is, a number between 0 and 1 multiplied by the average of the absolute values of the feature map. This guarantees that the threshold is positive and not too large.
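Here is a minimal sketch of this threshold sub-network, assuming PyTorch and the channel-wise thresholds described in the paper; the layer widths and the `ThresholdSubnet` name are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ThresholdSubnet(nn.Module):
    """Learned soft thresholding, DRSN style (channel-wise variant):
    A   = global average pooling of |features|   (one value per channel)
    α   = small FC network ending in Sigmoid, so α is in (0, 1)
    τ   = α × A, which is positive and not overly large."""
    def __init__(self, channels: int):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels),
            nn.BatchNorm1d(channels),
            nn.ReLU(inplace=True),
            nn.Linear(channels, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        abs_mean = x.abs().mean(dim=(2, 3))                       # A: (N, C)
        alpha = self.fc(abs_mean)                                 # α: (N, C)
        tau = (alpha * abs_mean).unsqueeze(-1).unsqueeze(-1)      # τ = α × A
        # soft thresholding with the learned per-sample, per-channel τ
        return torch.sign(x) * torch.clamp(x.abs() - tau, min=0.0)

x = torch.randn(4, 16, 8, 8)
print(ThresholdSubnet(16)(x).shape)  # torch.Size([4, 16, 8, 8])
```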
Moreover, different samples get different thresholds. To some extent, this can therefore be understood as a special attention mechanism: it notices the features that are irrelevant to the current task, transforms them toward values close to zero through the two convolutional layers, and sets them to zero by soft thresholding; or it notices the features that are relevant to the current task, transforms them toward values far from zero through the two convolutional layers, and preserves them.
Finally, stacking many of these basic modules together with convolutional layers, batch normalization, activation functions, global average pooling, and a fully connected output layer yields the Deep Residual Shrinkage Network.
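Below is a compact sketch of one such basic module and a small stack of them, assuming PyTorch; the pre-activation ordering, layer sizes, number of modules, and class names are illustrative choices, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class ResidualShrinkageBlock(nn.Module):
    """One DRSN basic module: two conv layers, learned channel-wise
    soft thresholding, then the identity shortcut."""
    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.BatchNorm2d(channels), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )
        self.alpha_fc = nn.Sequential(
            nn.Linear(channels, channels), nn.ReLU(inplace=True),
            nn.Linear(channels, channels), nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        f = self.body(x)
        a = f.abs().mean(dim=(2, 3))                               # A per channel
        tau = (self.alpha_fc(a) * a).unsqueeze(-1).unsqueeze(-1)   # τ = α × A
        f = torch.sign(f) * torch.clamp(f.abs() - tau, min=0.0)    # soft threshold
        return x + f                                               # residual shortcut

# Stacking the modules with the usual head and tail gives the full network.
net = nn.Sequential(
    nn.Conv2d(1, 16, 3, padding=1),
    *[ResidualShrinkageBlock(16) for _ in range(4)],
    nn.BatchNorm2d(16), nn.ReLU(inplace=True),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(16, 10),  # e.g. 10 output classes
)
print(net(torch.randn(2, 1, 32, 32)).shape)  # torch.Size([2, 10])
```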
5. Generalization Capability
The Deep Residual Shrinkage Network is in fact a general feature learning method. This is because in many feature learning tasks, the samples more or less contain noise and irrelevant information, and this noise and irrelevant information can affect feature learning performance. For example:
In image classification, if an image contains many other objects, these objects can be regarded as "noise". The Deep Residual Shrinkage Network may be able to use its attention mechanism to notice this "noise" and then use soft thresholding to set the features corresponding to the "noise" to zero, thereby possibly improving image classification accuracy.
In speech recognition, especially in noisy environments such as roadside conversations or a factory workshop, the Deep Residual Shrinkage Network may improve speech recognition accuracy, or at least offers an approach that could improve it.
Reference
Minghang Zhao, Shisheng Zhong, Xuyun Fu, Baoping Tang, Michael Pecht, Deep residual shrinkage networks for fault diagnosis, IEEE Transactions on Industrial Informatics, 2020, 16(7): 4681-4690.
https://ieeexplore.ieee.org/document/8850096
BibTeX
@article{Zhao2020,
author = {Minghang Zhao and Shisheng Zhong and Xuyun Fu and Baoping Tang and Michael Pecht},
title = {Deep Residual Shrinkage Networks for Fault Diagnosis},
journal = {IEEE Transactions on Industrial Informatics},
year = {2020},
volume = {16},
number = {7},
pages = {4681-4690},
doi = {10.1109/TII.2019.2943898}
}
Academic Impact
This paper has received more than 1,400 citations on Google Scholar.
According to those statistics, the Deep Residual Shrinkage Network (DRSN) has been applied, either directly or in modified form, in more than 1,000 publications across many fields, including mechanical engineering, electrical power, vision, healthcare, speech, text, radar, and remote sensing.