The Deep Residual Shrinkage Network is an improved variant of the Deep Residual Network. More precisely, it is an integration of the Deep Residual Network, attention mechanisms, and soft thresholding functions.
In a sense, the working principle of the Deep Residual Shrinkage Network can be understood as follows: it uses attention mechanisms to identify unimportant features and applies soft thresholding functions to set them to zero; conversely, it identifies important features and preserves them. This approach strengthens the ability of a deep neural network to extract useful features from noisy signals.
1. Research Motivation
First, when classifying samples, noise such as Gaussian noise, pink noise, and Laplacian noise is unavoidable. More broadly, samples often contain information that is irrelevant to the classification task at hand, and this information can also be regarded as noise. Such noise can degrade classification performance.
For example, during a conversation by the roadside, the audio may be mixed with vehicle sounds and horns. When we perform speech recognition on these signals, the results will inevitably be affected by that background noise. From a deep learning perspective, the features corresponding to the vehicle sounds and horns should be eliminated inside the deep neural network so that they do not disturb the speech recognition results.
Second, even within the same dataset, the amount of noise often differs from one sample to another. (This is closely related to attention mechanisms; for example, in an image dataset, the location of the target object may differ across images, and attention mechanisms can focus on that specific location.)
For example, when training a cat-and-dog classifier, consider five images labeled "dog". The first image may also contain a mouse, the second a duck, the third a chicken, the fourth a donkey, and the fifth a rooster. During training, the classifier will be disturbed by these irrelevant objects, the mouse, duck, chicken, donkey, and rooster, causing classification accuracy to drop. If we could identify these irrelevant objects and eliminate their features, we might improve the accuracy of the cat-and-dog classifier.
2. Soft Thresholding
Soft thresholding is a core step in many signal denoising algorithms. It removes features whose absolute values are below a certain threshold and shrinks features whose absolute values are above the threshold toward zero. This can be accomplished with the following formula:
\[y = \begin{cases} x - \tau & x > \tau \\ 0 & -\tau \le x \le \tau \\ x + \tau & x < -\tau \end{cases}\]
The derivative of the soft thresholding output with respect to the input is:
\[\frac{\partial y}{\partial x} = \begin{cases} 1 & x > \tau \\ 0 & -\tau \le x \le \tau \\ 1 & x < -\tau \end{cases}\]
As shown above, the derivative of soft thresholding is either 1 or 0. This property is the same as that of the ReLU activation function. Consequently, soft thresholding also helps deep learning algorithms avoid the vanishing and exploding gradient problems.
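The formula above takes only a few lines to implement. Below is a minimal sketch in PyTorch; the function name soft_threshold is our own choice, not an identifier from the paper's code. The printed gradient matches the derivative formula above.

```python
import torch

# A minimal soft-thresholding sketch.
def soft_threshold(x: torch.Tensor, tau: torch.Tensor) -> torch.Tensor:
    """Set values in [-tau, tau] to 0; move values outside closer to zero by tau."""
    return torch.sign(x) * torch.clamp(x.abs() - tau, min=0.0)

x = torch.tensor([-2.0, -0.5, 0.0, 0.3, 1.5], requires_grad=True)
y = soft_threshold(x, torch.tensor(1.0))
print(y)        # [-1.0, 0.0, 0.0, 0.0, 0.5]

# The gradient is 1 outside [-tau, tau] and 0 inside, matching the derivative above.
y.sum().backward()
print(x.grad)   # [1., 0., 0., 0., 1.]
```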
In the soft thresholding function, the threshold must satisfy two conditions: first, the threshold must be a positive number; second, the threshold must not exceed the maximum (absolute) value of the input signal, otherwise the output would be all zeros.
Furthermore, it would be ideal for the threshold to satisfy a third condition: each sample should have its own independent threshold, determined by its own noise content.
This is because the noise content often varies across samples. For example, within the same dataset it often happens that Sample A contains little noise while Sample B contains a lot. In that case, when applying soft thresholding in a denoising algorithm, Sample A should use a smaller threshold while Sample B uses a larger one. Although these features and thresholds no longer have explicit physical definitions inside a deep neural network, the logic is the same: each sample should have an independent threshold determined by its own noise content.
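Continuing the sketch above, a brief illustration of per-sample thresholds: each row of a batch is shrunk with its own threshold via broadcasting. The values and shapes here are our own illustrative choices.

```python
# Each sample (row) gets its own threshold, broadcast over its features.
batch = torch.tensor([[0.2, -1.5, 3.0],    # Sample A: little noise
                      [2.0, -4.0, 0.5]])   # Sample B: more noise
taus = torch.tensor([[0.3],                # smaller threshold for Sample A
                     [1.0]])               # larger threshold for Sample B
print(soft_threshold(batch, taus))
# [[ 0.0, -1.2,  2.7],
#  [ 1.0, -3.0,  0.0]]
```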
3. Attention Mechanism
Attention mechanisms are easy to understand in the context of computer vision. Animal visual systems can quickly scan an entire scene to detect a target, then focus attention on the target object to extract finer details while suppressing irrelevant information. For more background, see the literature on attention mechanisms.
The Squeeze-and-Excitation Network (SENet) is a relatively recent deep learning method that employs attention mechanisms. Across different samples, the importance of different feature channels to the classification task often differs. SENet uses a small sub-network to learn a set of weights, then multiplies these weights with the features of the corresponding channels to adjust the magnitude of each channel's features. This process can be viewed as applying a distinct attention weighting to each feature channel.
In this approach, each sample has its own set of weights; in other words, the weights differ from sample to sample. In SENet, the path for obtaining the weights is: Global Pooling → Fully Connected Layer → ReLU Function → Fully Connected Layer → Sigmoid Function.
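As a rough sketch of this weight path in PyTorch (the reduction ratio of 16 is a conventional default, not something stated in this text):

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-Excitation: learn a per-sample weight for each channel."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        # Global Pooling -> FC -> ReLU -> FC -> Sigmoid, as listed above.
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, height, width)
        w = x.mean(dim=(2, 3))          # global average pooling -> (batch, channels)
        w = self.fc(w)                  # per-sample channel weights in (0, 1)
        return x * w[:, :, None, None]  # rescale each feature channel
```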
4. Soft Thresholding with Deep Attention Mechanism
The Deep Residual Shrinkage Network draws on the SENet sub-network structure described above to implement soft thresholding under a deep attention mechanism. Through this sub-network, a set of thresholds can be learned, and soft thresholding is then applied to each feature channel.
In this sub-network, the absolute values of all the features in the input feature map are computed first. Then, through global average pooling and averaging, a single feature is obtained, denoted A. In the other path, the feature map after global average pooling is fed into a small fully connected network. This fully connected network uses a Sigmoid function as its final layer to scale the output to between 0 and 1, yielding a coefficient denoted α. The final threshold can then be expressed as α×A. The threshold is therefore a number between 0 and 1 multiplied by the average of the absolute values of the feature map, which ensures that the threshold is positive and not excessively large.
Moreover, different samples produce different thresholds. Consequently, in a sense, this can be understood as a specialized attention mechanism: it notices features that are irrelevant to the current task, pushes their values close to zero through the two convolutional layers, and then sets them exactly to zero using soft thresholding; conversely, it notices features that are relevant to the current task, pushes their values far from zero, and preserves them.
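Putting the two ideas together, here is a hedged sketch of one residual shrinkage building unit with channel-wise thresholds (the DRSN-CW variant from the paper). Kernel sizes, the reduction ratio, and the use of 2-D convolutions are our illustrative choices; the paper itself works with 1-D vibration signals. Only the overall structure, two convolutional layers followed by the threshold sub-network with soft thresholding, plus an identity shortcut, follows the description above.

```python
import torch
import torch.nn as nn

class Shrinkage(nn.Module):
    """Learn a per-sample, per-channel threshold tau = alpha * A,
    then apply soft thresholding, as described above."""
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.BatchNorm1d(channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        a = x.abs().mean(dim=(2, 3))         # A: mean absolute value per channel
        alpha = self.fc(a)                   # coefficient alpha in (0, 1)
        tau = (alpha * a)[:, :, None, None]  # threshold = alpha * A
        return torch.sign(x) * torch.clamp(x.abs() - tau, min=0.0)

class RSBU(nn.Module):
    """Residual shrinkage building unit: two conv layers, then shrinkage,
    plus the identity shortcut."""
    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.BatchNorm2d(channels), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
            Shrinkage(channels),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.body(x)
```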
Finally, by stacking many of these basic modules together with convolutional layers, batch normalization, activation functions, global average pooling, and a fully connected output layer, the complete Deep Residual Shrinkage Network is constructed, as sketched below.
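Reusing the RSBU block from the previous sketch, a minimal illustration of this stacking (depths, widths, and input shapes are arbitrary choices here):

```python
class DRSN(nn.Module):
    """Conv stem -> stacked shrinkage blocks -> BN/ReLU -> GAP -> FC output."""
    def __init__(self, in_channels: int = 1, channels: int = 16,
                 num_blocks: int = 4, num_classes: int = 10):
        super().__init__()
        self.stem = nn.Conv2d(in_channels, channels, 3, padding=1)
        self.blocks = nn.Sequential(*[RSBU(channels) for _ in range(num_blocks)])
        self.head = nn.Sequential(nn.BatchNorm2d(channels), nn.ReLU(inplace=True))
        self.fc = nn.Linear(channels, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.head(self.blocks(self.stem(x)))
        x = x.mean(dim=(2, 3))   # global average pooling
        return self.fc(x)

model = DRSN()
logits = model(torch.randn(2, 1, 32, 32))  # e.g. two single-channel inputs
print(logits.shape)                        # torch.Size([2, 10])
```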
5. Generalization Capability
The Deep Residual Shrinkage Network is, in fact, a general feature learning method. This is because, in many feature learning tasks, samples tend to contain noise or task-irrelevant information, and this noise and irrelevant information can degrade feature learning performance. For example:
In image classification, if an image contains many other objects, those objects can be regarded as noise. The Deep Residual Shrinkage Network may be able to use its attention mechanism to notice this noise and then use soft thresholding to set the noise-related features to zero, which can improve image classification accuracy.
In speech recognition, especially in highly noisy environments such as roadsides or factory workshops, the Deep Residual Shrinkage Network may improve speech recognition accuracy, or at least provide an approach capable of improving it.
Reference
Minghang Zhao, Shisheng Zhong, Xuyun Fu, Baoping Tang, Michael Pecht, Deep residual shrinkage networks for fault diagnosis, IEEE Transactions on Industrial Informatics, 2020, 16(7): 4681-4690.
https://ieeexplore.ieee.org/document/8850096
BibTeX
@article{Zhao2020,
author = {Minghang Zhao and Shisheng Zhong and Xuyun Fu and Baoping Tang and Michael Pecht},
title = {Deep Residual Shrinkage Networks for Fault Diagnosis},
journal = {IEEE Transactions on Industrial Informatics},
year = {2020},
volume = {16},
number = {7},
pages = {4681-4690},
doi = {10.1109/TII.2019.2943898}
}
Academic Impact
This paper has been cited more than 1,400 times on Google Scholar.
According to incomplete statistics, the Deep Residual Shrinkage Network (DRSN) has been used, directly or in modified form, in more than 1,000 publications and studies across a variety of fields, including mechanical engineering, electrical power, vision, healthcare, speech, text, radar, and remote sensing.