The Deep Residual Shrinkage Network is an improved variant of the Deep Residual Network. In essence, it integrates three components: deep residual networks, attention mechanisms, and soft thresholding functions.
The Deep Residual Shrinkage Network works as follows. First, the network uses attention mechanisms to notice the features that are unimportant. Then, it uses soft thresholding functions to set those unimportant features to zero. Conversely, the network notices the features that are important and preserves them. This approach strengthens the ability of a deep neural network to extract useful features from signals contaminated with noise.
1. Research Motivation
First, when an algorithm classifies samples, noise is unavoidable. Examples include Gaussian noise, pink noise, and Laplacian noise. More broadly, samples often contain information that is irrelevant to the classification task at hand; this irrelevant information can also be regarded as noise. Such noise can degrade classification performance. (Soft thresholding, discussed below, is a key step in many signal-denoising algorithms.)
For example, consider a conversation by a roadside. The audio may pick up the sound of passing cars and footsteps. If we want to perform speech recognition on such signals, the background sounds will harm the result. From a deep learning perspective, the deep neural network should eliminate the features associated with the car sounds and footsteps, so that they do not degrade the speech recognition results.
Second, the amount of noise often differs from sample to sample, even within the same dataset. (This variation is analogous to how attention mechanisms operate. Take an image dataset as an example: the location of the target object can differ across images, and an attention mechanism can focus on the specific location of the target object in each image.)
Suppose we are training a cat-and-dog classifier, and we have prepared five images labeled "dog":
- Image 1 may contain a dog and a mouse.
- Image 2 may contain a dog and a goose.
- Image 3 may contain a dog and a chicken.
- Image 4 may contain a dog and a donkey.
- Image 5 may contain a dog and a duck.
During training, these irrelevant objects (the mouse, goose, chicken, donkey, and duck) will interfere with the classifier and reduce its classification accuracy. If we could detect these irrelevant objects, we could eliminate the features associated with them, thereby improving the accuracy of the cat-and-dog classifier.
2. Soft Thresholding
Soft thresholding is a key step in many signal-denoising algorithms. Features whose absolute values are below a certain threshold are deleted (set to zero), while features whose absolute values are above the threshold are shrunk toward zero. Soft thresholding can be implemented with the following formula:
\[y = \begin{cases} x - \tau & x > \tau \\ 0 & -\tau \le x \le \tau \\ x + \tau & x < -\tau \end{cases}\]

The derivative of soft thresholding with respect to the input is:

\[\frac{\partial y}{\partial x} = \begin{cases} 1 & x > \tau \\ 0 & -\tau \le x \le \tau \\ 1 & x < -\tau \end{cases}\]

As the formula above shows, the derivative of soft thresholding is either 1 or 0, exactly the same property as the ReLU activation function. Soft thresholding can therefore also reduce the risk of gradient vanishing and gradient exploding in deep learning algorithms.
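To make this concrete, here is a minimal sketch of soft thresholding and its gradient, assuming PyTorch (the paper does not prescribe any particular implementation). The closed form sign(x) · max(|x| − τ, 0) is equivalent to the piecewise formula above; PyTorch also ships the same function as `torch.nn.functional.softshrink`.

```python
import torch

def soft_threshold(x: torch.Tensor, tau: float) -> torch.Tensor:
    # sign(x) * max(|x| - tau, 0) is a closed form of the piecewise definition:
    # values inside [-tau, tau] become 0, everything else shrinks toward 0 by tau.
    return torch.sign(x) * torch.clamp(torch.abs(x) - tau, min=0.0)

x = torch.linspace(-3.0, 3.0, 7, requires_grad=True)  # -3, -2, -1, 0, 1, 2, 3
y = soft_threshold(x, tau=1.5)
print(y)       # -1.5, -0.5, 0, 0, 0, 0.5, 1.5

# The gradient is 1 outside [-tau, tau] and 0 inside, matching the derivative
# formula above and mirroring the 0/1 gradient of ReLU.
y.sum().backward()
print(x.grad)  # 1, 1, 0, 0, 0, 1, 1
```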
In the soft thresholding function, the threshold must satisfy two conditions. First, the threshold must be a positive number. Second, the threshold must not exceed the maximum absolute value of the input signal; otherwise, the output would be all zeros.
In addition, it is preferable for the threshold to satisfy a third condition: each sample should have its own threshold, determined by the amount of noise in that sample.
The reason is that the noise content often varies across samples. For example, within the same dataset, Sample A may contain little noise while Sample B contains heavy noise. In that case, Sample A should use a small threshold in soft thresholding, and Sample B a large one. Although such features and thresholds lose their explicit physical meaning in a deep neural network, the underlying logic is unchanged. In other words, each sample should have an independent threshold, determined by its own noise content.
3. Attention Mechanism
Attention mechanisms are easiest to understand in the context of computer vision. Animals and humans can quickly scan an entire scene with their eyes, then focus their attention on the target object. This allows them to perceive more detail while suppressing irrelevant information. Readers who want to learn more can consult the literature on attention mechanisms.
The Squeeze-and-Excitation Network (SENet) is a relatively recent deep learning method that employs an attention mechanism. In different samples, different feature channels contribute differently to the classification task. SENet uses a small sub-network to obtain a set of weights, then multiplies these weights with the features of the corresponding channels, rescaling the features in each channel. This process can be viewed as applying a weighting to each feature channel, that is, paying a different amount of attention to each channel.
In this approach, each sample has its own independent set of weights; the weights of any two samples may differ. In SENet, the weights are obtained along the path "global pooling → fully connected layer → ReLU function → fully connected layer → Sigmoid function."
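As a concrete illustration, here is a minimal PyTorch sketch of an SE block that follows exactly this path; the reduction ratio of 16 and the 2-D feature-map shape are common SENet conventions, assumed here for illustration.

```python
import torch
from torch import nn

class SEBlock(nn.Module):
    """Channel attention: global pooling -> FC -> ReLU -> FC -> Sigmoid."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),  # each weight lies in (0, 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape             # x: (batch, channels, height, width)
        w = x.mean(dim=(2, 3))           # squeeze: global average pooling -> (b, c)
        w = self.fc(w).view(b, c, 1, 1)  # excitation: per-sample channel weights
        return x * w                     # apply weighting to each feature channel
```

Because the weights are computed from each sample's own pooled features, any two samples can receive different sets of weights.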
4. Soft Thresholding with Deep Attention Mechanism
The Deep Residual Shrinkage Network borrows this sub-network structure from SENet to perform soft thresholding under a deep attention mechanism. The sub-network learns a set of thresholds, which the network then uses to apply soft thresholding to each feature channel.
In this sub-network, the absolute values of all features in the input feature map are computed first. Then, global average pooling produces a feature, denoted A. In a parallel path, the result of the global average pooling is fed into a small fully connected network whose last layer is a Sigmoid function, so that its output is normalized between 0 and 1; this yields a coefficient, denoted α. The final threshold is α × A. The threshold is thus the product of two numbers: one between 0 and 1, and the other the average of the absolute values of the feature map. This construction guarantees that the threshold is always positive and never too large.
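A minimal sketch of this threshold sub-network for 1-D signals, again assuming PyTorch; the widths of the fully connected layers are illustrative assumptions (the paper's sub-network also includes batch normalization, omitted here for brevity).

```python
import torch
from torch import nn

class Shrinkage(nn.Module):
    """Learns a per-sample, per-channel threshold tau = alpha * A and applies
    soft thresholding with it."""

    def __init__(self, channels: int):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels),
            nn.ReLU(inplace=True),
            nn.Linear(channels, channels),
            nn.Sigmoid(),  # the coefficient alpha lies in (0, 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, length)
        A = x.abs().mean(dim=2)          # global average pooling of absolute values
        alpha = self.fc(A)               # coefficient between 0 and 1
        tau = (alpha * A).unsqueeze(-1)  # threshold = alpha * A: positive, not too large
        return torch.sign(x) * torch.clamp(x.abs() - tau, min=0.0)
```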
Moreover, different samples produce different thresholds. This approach can therefore be understood as a special attention mechanism: it notices the features that are irrelevant to the current task, transforms them into values close to zero through two convolutional layers, and then sets them to zero via soft thresholding. Conversely, it notices the features that are relevant to the current task, transforms them into values far from zero through two convolutional layers, and finally preserves them.
Finally, many of these basic modules are stacked, together with convolutional layers, batch normalization, activation functions, global average pooling, and a fully connected output layer, to build the complete Deep Residual Shrinkage Network.
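Reusing the Shrinkage module from the previous sketch, the stacking could look as follows; the channel counts, kernel sizes, depth, and number of output classes are all illustrative assumptions, not values from the paper.

```python
import torch
from torch import nn

class RSBU(nn.Module):
    """One basic residual shrinkage module with an identity shortcut."""

    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.BatchNorm1d(channels), nn.ReLU(inplace=True),
            nn.Conv1d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm1d(channels), nn.ReLU(inplace=True),
            nn.Conv1d(channels, channels, kernel_size=3, padding=1),
        )
        self.shrink = Shrinkage(channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.shrink(self.body(x))  # residual (identity) connection

# Conv stem, stacked modules, then BN, ReLU, global average pooling, FC output.
model = nn.Sequential(
    nn.Conv1d(1, 16, kernel_size=3, padding=1),
    RSBU(16), RSBU(16), RSBU(16),
    nn.BatchNorm1d(16), nn.ReLU(inplace=True),
    nn.AdaptiveAvgPool1d(1), nn.Flatten(),
    nn.Linear(16, 10),                        # e.g., 10 output classes
)
logits = model(torch.randn(8, 1, 128))        # a batch of 8 single-channel signals
```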
5. Generalization Capability
The Deep Residual Shrinkage Network is in fact a general-purpose method for feature learning. This is because, in many feature learning tasks, the samples contain noise as well as irrelevant information, both of which can harm feature learning performance. For example:
Consider image classification. An image may contain many other objects, which can be regarded as "noise." The Deep Residual Shrinkage Network can use its attention mechanism to notice this "noise" and then use soft thresholding to set the features corresponding to it to zero, potentially improving image classification accuracy.
Consider speech recognition, especially in noisy environments such as a roadside conversation or a factory workshop. The Deep Residual Shrinkage Network may improve speech recognition accuracy; at the very least, it offers a methodology that could do so.
Reference
Minghang Zhao, Shisheng Zhong, Xuyun Fu, Baoping Tang, Michael Pecht, Deep residual shrinkage networks for fault diagnosis, IEEE Transactions on Industrial Informatics, 2020, 16(7): 4681-4690.
https://ieeexplore.ieee.org/document/8850096
BibTeX
@article{Zhao2020,
author = {Minghang Zhao and Shisheng Zhong and Xuyun Fu and Baoping Tang and Michael Pecht},
title = {Deep Residual Shrinkage Networks for Fault Diagnosis},
journal = {IEEE Transactions on Industrial Informatics},
year = {2020},
volume = {16},
number = {7},
pages = {4681-4690},
doi = {10.1109/TII.2019.2943898}
}
Academic Impact
This paper has been cited more than 1,400 times on Google Scholar.
According to our statistics, researchers have applied the Deep Residual Shrinkage Network (DRSN) in more than 1,000 publications, across fields including mechanical engineering, electrical power, computer vision, healthcare, speech, text, radar, and remote sensing.