The Deep Residual Shrinkage Network is an improved variant of the Deep Residual Network. In essence, it is an integration of the Deep Residual Network, attention mechanisms, and soft thresholding functions.
To some extent, the working mechanism of the Deep Residual Shrinkage Network can be understood as follows: it uses attention mechanisms to identify unimportant features and applies soft thresholding to set them to zero; conversely, it identifies important features and retains them. This process strengthens the ability of a deep neural network to extract useful features from noisy signals.
1. Research Motivation
First, when classifying samples, the presence of noise, such as Gaussian noise, pink noise, and Laplacian noise, is unavoidable. More broadly, samples often contain information irrelevant to the current classification task, which can also be treated as noise. Such noise can harm classification performance. (Soft thresholding is a key step in many signal denoising algorithms.)
For example, during a conversation by the roadside, the audio may contain car horns and tire noise. When speech recognition is performed on these signals, the results will inevitably be affected by this background noise. From a deep learning perspective, the features corresponding to the horns and tire noise should be eliminated inside the deep neural network to prevent them from degrading the speech recognition results.
Second, even within the same dataset, the amount of noise varies from sample to sample. (This is analogous to attention mechanisms: taking an image dataset as an example, the location of the target object may differ across images, and an attention mechanism can focus on the specific location of the target object in each image.)
For example, when training a cat-and-dog classifier, consider five images labeled "dog." The first image might contain a dog and a mouse, the second a dog and a goose, the third a dog and a chicken, the fourth a dog and a donkey, and the fifth a dog and a duck. During training, the classifier will inevitably be disturbed by irrelevant objects such as the mice, geese, chickens, donkeys, and ducks, lowering classification accuracy. If we could detect these unimportant objects and remove their corresponding features, we might improve the accuracy of the cat-and-dog classifier.
2. Soft Thresholding
Soft thresholding is a key step in many signal denoising algorithms. It sets features whose absolute values are below a certain threshold to zero and shrinks features whose absolute values exceed the threshold toward zero. It can be implemented with the following formula:
\[y = \begin{cases} x - \tau & x > \tau \\ 0 & -\tau \le x \le \tau \\ x + \tau & x < -\tau \end{cases}\]
The derivative of the soft thresholding output with respect to the input is:
\[\frac{\partial y}{\partial x} = \begin{cases} 1 & x > \tau \\ 0 & -\tau \le x \le \tau \\ 1 & x < -\tau \end{cases}\]
As shown above, the derivative of soft thresholding is either 1 or 0. This property is identical to that of the ReLU activation function. Consequently, soft thresholding can also reduce the risk of gradient vanishing and gradient exploding in deep learning algorithms.
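The piecewise formula above collapses to a one-line expression. As a minimal sketch (function and variable names are illustrative, not taken from the paper):

```python
import numpy as np

def soft_threshold(x: np.ndarray, tau: float) -> np.ndarray:
    """Elementwise soft thresholding: zero out entries with |x| <= tau,
    and shrink the remaining entries toward zero by tau."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

# Small-magnitude entries become zero; large ones move toward zero by tau.
x = np.array([-3.0, -0.5, 0.2, 1.5])
print(soft_threshold(x, tau=1.0))  # [-2. -0.  0.  0.5]
```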
For the soft thresholding function, the threshold must satisfy two conditions: first, the threshold must be a positive number; second, the threshold cannot be larger than the maximum absolute value of the input signal, otherwise the output will be all zeros.
Ideally, the threshold should also satisfy a third condition: each sample should have its own independent threshold, determined by its own noise content.
The reason is that the noise content often differs between samples. For example, within the same dataset it is common for Sample A to contain little noise while Sample B contains a lot. In that case, when soft thresholding is used in a denoising algorithm, Sample A should use a smaller threshold and Sample B a larger one. Although features and thresholds lose their explicit physical meaning inside a deep neural network, the underlying logic remains the same: each sample should have its own independent threshold determined by its specific noise content.
3. Attention Mechanism
Attention mechanisms are easiest to understand in the context of computer vision. The visual system of an animal can rapidly scan an entire scene to detect a target, then focus attention on the target object to extract more detail while suppressing irrelevant information. For further details, please refer to the literature on attention mechanisms.
The Squeeze-and-Excitation Network (SENet) is a relatively recent deep learning method built on attention mechanisms. Across different samples, different feature channels contribute differently to the classification task. SENet uses a small sub-network to learn a set of weights and multiplies these weights with the features of the corresponding channels, adjusting the magnitude of the features in each channel. This process can be viewed as applying a weighting to each feature channel.
With this approach, each sample has its own independent set of weights; in other words, the weights of any two samples can differ. In SENet, the specific path for obtaining the weights is "Global Pooling → Fully Connected Layer → ReLU Function → Fully Connected Layer → Sigmoid Function."
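A minimal PyTorch sketch of this weighting path may help make it concrete (the bottleneck ratio `reduction` and the layer widths are conventional assumptions, not specified in the text):

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Sketch of the SENet path: Global Pooling -> FC -> ReLU -> FC ->
    Sigmoid, followed by channel-wise rescaling of the feature map."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)      # global average pooling
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),                        # one weight in (0, 1) per channel
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c))     # per-sample, per-channel weights
        return x * w.view(b, c, 1, 1)            # rescale each feature channel
```

Because the weights are computed from each input, any two samples generally receive different sets of weights, which is exactly the per-sample behavior described above.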
4. Soft Thresholding with Deep Attention Mechanism
The Deep Residual Shrinkage Network borrows the SENet sub-network structure described above to implement soft thresholding under a deep attention mechanism. Through this sub-network, a set of thresholds can be learned, and soft thresholding applied to each feature channel.
In this sub-network, the absolute values of all features in the input feature map are computed first. Then, through global average pooling (GAP), a feature is obtained, denoted A. Along the other path, the feature map after global average pooling is fed into a small fully connected network. This fully connected network ends with a Sigmoid function, which normalizes its output to between 0 and 1, producing a coefficient denoted α. The final threshold can then be written as α × A. The threshold is therefore the product of a number between 0 and 1 and the average of the absolute values of the feature map. This construction ensures that the threshold is not only positive but also never too large.
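A hedged PyTorch sketch of this threshold sub-network follows (the channel-wise form, the layer widths, and the use of batch normalization are assumptions for illustration; the paper also discusses a channel-shared variant):

```python
import torch
import torch.nn as nn

class ShrinkageThreshold(nn.Module):
    """Sketch of the threshold sub-network: tau = alpha * A, where A is
    the GAP of |features| and alpha in (0, 1) comes from a small FC
    network ending in a Sigmoid. Soft thresholding is then applied."""

    def __init__(self, channels: int):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels),
            nn.BatchNorm1d(channels),
            nn.ReLU(inplace=True),
            nn.Linear(channels, channels),
            nn.Sigmoid(),                         # alpha in (0, 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        a = x.abs().mean(dim=(2, 3))              # A: GAP of absolute values
        alpha = self.fc(a)                        # per-sample, per-channel alpha
        tau = (alpha * a).view(b, c, 1, 1)        # threshold = alpha * A
        # soft thresholding with the learned threshold
        return torch.sign(x) * torch.clamp(x.abs() - tau, min=0.0)
```

Because α is strictly between 0 and 1, the threshold α × A is positive and never exceeds the average absolute value of the features, matching the conditions discussed in Section 2.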
Moreover, different samples yield different thresholds. To some extent, this can therefore be viewed as a special attention mechanism: it notices features irrelevant to the current task, transforms them into values close to zero through the two convolutional layers, and sets them to zero with soft thresholding; conversely, it notices features relevant to the current task, transforms them into values far from zero through the two convolutional layers, and retains them.
Finally, by stacking a number of these basic modules together with convolutional layers, batch normalization, activation functions, global average pooling, and a fully connected output layer, the complete Deep Residual Shrinkage Network is constructed, as sketched below.
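A minimal sketch of one basic module and of the stacked network, reusing the `ShrinkageThreshold` module from the previous sketch (the block count, widths, and 3-channel input stem are illustrative assumptions, not the paper's exact configuration):

```python
import torch.nn as nn

class ResidualShrinkageBlock(nn.Module):
    """One basic module: BN -> ReLU -> Conv, twice, then learned soft
    thresholding, wrapped in an identity shortcut. Stride and width
    changes are omitted for simplicity."""

    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.BatchNorm2d(channels), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )
        self.shrink = ShrinkageThreshold(channels)

    def forward(self, x):
        return x + self.shrink(self.body(x))      # shortcut around conv + shrinkage

def build_drsn(channels: int = 16, num_blocks: int = 4, num_classes: int = 10):
    """Stack the basic modules, then apply GAP and an FC output layer."""
    return nn.Sequential(
        nn.Conv2d(3, channels, 3, padding=1),     # input stem (3-channel images assumed)
        *[ResidualShrinkageBlock(channels) for _ in range(num_blocks)],
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),    # global average pooling
        nn.Linear(channels, num_classes),         # fully connected output layer
    )
```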
5. Generalization Capability
The Deep Residual Shrinkage Network is, in essence, a general-purpose feature learning method. This is because, in many feature learning tasks, samples contain some amount of noise and irrelevant information, and this noise and irrelevant information can degrade feature learning performance. For example:
In image classification, if an image simultaneously contains many other objects, these objects can be understood as "noise." The Deep Residual Shrinkage Network may be able to use its attention mechanism to notice this "noise" and then use soft thresholding to zero out the corresponding features, potentially improving image classification accuracy.
In speech recognition, especially in noisy settings such as a roadside conversation or a factory workshop, the Deep Residual Shrinkage Network may improve speech recognition accuracy, or at least offer a methodology capable of doing so.
Reference
Minghang Zhao, Shisheng Zhong, Xuyun Fu, Baoping Tang, Michael Pecht, Deep residual shrinkage networks for fault diagnosis, IEEE Transactions on Industrial Informatics, 2020, 16(7): 4681-4690.
https://ieeexplore.ieee.org/document/8850096
BibTeX
@article{Zhao2020,
author = {Minghang Zhao and Shisheng Zhong and Xuyun Fu and Baoping Tang and Michael Pecht},
title = {Deep Residual Shrinkage Networks for Fault Diagnosis},
journal = {IEEE Transactions on Industrial Informatics},
year = {2020},
volume = {16},
number = {7},
pages = {4681-4690},
doi = {10.1109/TII.2019.2943898}
}
Academic Impact
This paper has received more than 1,400 citations on Google Scholar.
According to incomplete statistics, the Deep Residual Shrinkage Network (DRSN) has been applied directly, or adapted and then applied, in more than 1,000 papers and studies across many fields, including mechanical engineering, electrical power, vision, healthcare, speech, text, radar, and remote sensing.