The Deep Residual Shrinkage Network (DRSN) is an improved variant of the Deep Residual Network. In essence, it is a combination of the Deep Residual Network, attention mechanisms, and soft thresholding functions.
To some extent, the working principle of the Deep Residual Shrinkage Network can be understood as follows: it uses attention mechanisms to notice unimportant features and sets them to zero with soft thresholding functions, while noticing important features and retaining them. This strengthens the ability of a deep neural network to extract useful features from signals contaminated by noise.
1. Research Motivation
First, when classifying samples, the presence of noise, such as Gaussian noise, pink noise, and Laplacian noise, is unavoidable. In a broader sense, samples often contain information that is irrelevant to the classification task at hand, and this information can also be regarded as noise. Such noise can degrade classification performance. (Soft thresholding is a core step in many signal denoising algorithms.)
For example, in a conversation by a roadside, speech may be mixed with traffic noise such as car horns and wheel sounds. When speech recognition is performed on these signals, the results will inevitably be disturbed by the background noise. From a deep learning perspective, the features corresponding to the car horns and wheel sounds should be eliminated inside the deep neural network so that they do not affect the speech recognition results.
Second, even within a single dataset, the amount of noise usually varies from sample to sample. (This is analogous to attention mechanisms: taking an image dataset as an example, the location of the target object can differ from image to image, and attention mechanisms can focus on the target's location in each individual image.)
For example, when training a cat-and-dog classifier, consider five images labeled "dog". The first image may contain a dog and a mouse, the second a dog and a goose, the third a dog and a chicken, the fourth a dog and a donkey, and the fifth a dog and a duck. During training, the classifier will be disturbed by the irrelevant objects, namely the mouse, goose, chicken, donkey, and duck, which reduces classification accuracy. If we can detect these irrelevant objects and delete their features, the accuracy of the cat-and-dog classifier can be improved.
2. Soft Thresholding
Soft thresholding is a core step in many signal denoising algorithms. It deletes features whose absolute values are smaller than a certain threshold and shrinks features whose absolute values are larger than that threshold toward zero. It can be implemented with the following formula:
\[y = \begin{cases} x - \tau & x > \tau \\ 0 & -\tau \le x \le \tau \\ x + \tau & x < -\tau \end{cases}\]
The derivative of the output of soft thresholding with respect to the input is:
\[\frac{\partial y}{\partial x} = \begin{cases} 1 & x > \tau \\ 0 & -\tau \le x \le \tau \\ 1 & x < -\tau \end{cases}\]
As shown above, the derivative of soft thresholding is either 1 or 0, a property it shares with the ReLU activation function. Therefore, soft thresholding can also reduce the risk of gradient vanishing and gradient exploding in deep learning algorithms.
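The piecewise formula above can be written compactly as sign(x) · max(|x| − τ, 0). A minimal NumPy sketch (an illustration of the operation, not the authors' implementation):

```python
import numpy as np

def soft_threshold(x, tau):
    """Soft thresholding: zero out values with |x| <= tau and
    shrink the remaining values toward zero by tau."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

x = np.array([-3.0, -1.0, 0.5, 2.0])
print(soft_threshold(x, tau=1.0))
```

With τ = 1, the values −1.0 and 0.5 fall inside the dead zone and become zero, while −3.0 and 2.0 are shrunk by 1 toward zero.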
In the soft thresholding function, the threshold must satisfy two conditions: first, the threshold must be positive; second, the threshold must not exceed the maximum absolute value of the input signal, otherwise the output would be all zeros.
Moreover, the threshold had better satisfy a third condition: each sample should have its own independent threshold, determined by its own noise content.
This is because the amount of noise usually varies across samples. For example, it is common within a single dataset for Sample A to contain little noise while Sample B contains a lot. In that case, when soft thresholding is applied in a denoising algorithm, Sample A should use a smaller threshold and Sample B a larger one. Although the features and thresholds lose their explicit physical meaning inside deep neural networks, the underlying logic remains the same: each sample should have its own independent threshold determined by its noise content.
3. Attention Mechanism
Attention mechanisms are easiest to understand in the field of computer vision. An animal's visual system can quickly scan an entire scene to find the target object, and then focus its attention on that target to capture more detail while suppressing irrelevant information. For details, please refer to the literature on attention mechanisms.
The Squeeze-and-Excitation Network (SENet) is a relatively new deep learning method that employs an attention mechanism. Across different samples, the contributions of different feature channels to the classification task usually differ. SENet uses a small sub-network to obtain a set of weights and then multiplies these weights with the features of the corresponding channels, rescaling the magnitude of the features in each channel. This can be viewed as applying a weight to each feature channel.
In this way, each sample gets its own set of weights; in other words, any two samples can have different weights. In SENet, the weights are obtained via the path "Global Pooling → Fully Connected Layer → ReLU Function → Fully Connected Layer → Sigmoid Function."
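The weight-generating path quoted above can be sketched in NumPy for a single sample. The weight matrices `w1` and `w2` are random stand-ins here; in SENet they are learned during training, and the hidden layer is usually narrower than the channel count by a reduction ratio:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def se_weights(feature_map, w1, w2):
    """SENet-style path: Global Pooling -> FC -> ReLU -> FC -> Sigmoid.
    feature_map has shape (C, H, W); returns one weight in (0, 1) per channel."""
    pooled = feature_map.mean(axis=(1, 2))   # global average pooling, shape (C,)
    hidden = np.maximum(w1 @ pooled, 0.0)    # fully connected layer + ReLU
    return sigmoid(w2 @ hidden)              # fully connected layer + Sigmoid

C, H, W, r = 8, 4, 4, 2                      # r is the channel reduction ratio
fmap = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C))        # toy weights; trained in practice
w2 = rng.standard_normal((C, C // r))
weights = se_weights(fmap, w1, w2)
recalibrated = fmap * weights[:, None, None]  # per-channel rescaling
```

Because the weights are computed from the sample's own pooled features, two different samples generally produce two different weight vectors, which is exactly the per-sample behavior described above.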
4. Soft Thresholding with Deep Attention Mechanism
The Deep Residual Shrinkage Network borrows the SENet sub-network structure described above to implement soft thresholding under a deep attention mechanism. Through this sub-network (the part highlighted in the red box in the original paper's figure), a set of thresholds can be learned and applied to each feature channel.
In this sub-network, the absolute values of all features in the input feature map are computed first. Then, after global average pooling, a single feature is obtained, denoted A. In the other path, the globally average-pooled feature map is fed into a small fully connected network. This fully connected network uses a Sigmoid function as its final layer to scale its output to the interval (0, 1), yielding a coefficient denoted α. The final threshold can then be expressed as α × A. In other words, the threshold is a number between 0 and 1 multiplied by the average of the absolute values of the feature map. This design ensures that the threshold is not only positive but also never too large.
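The threshold computation τ = α × A can be sketched as follows. As before, `w1` and `w2` are random placeholders for weights that would be learned during training; the construction guarantees 0 < τ < A per channel, matching the two conditions from Section 2:

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def drsn_thresholds(feature_map, w1, w2):
    """Per-channel thresholds tau_c = alpha_c * A_c, where A_c is the
    average absolute value of channel c and alpha_c in (0, 1) comes from
    a small FC network ending in a Sigmoid."""
    A = np.abs(feature_map).mean(axis=(1, 2))  # average |feature| per channel
    hidden = np.maximum(w1 @ A, 0.0)           # FC + ReLU
    alpha = sigmoid(w2 @ hidden)               # FC + Sigmoid, in (0, 1)
    return alpha * A                           # positive and bounded by A

C, H, W = 4, 3, 3
fmap = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C, C))   # toy weights; learned in practice
w2 = rng.standard_normal((C, C))
tau = drsn_thresholds(fmap, w1, w2)
```

Since α < 1, the threshold can never reach the average absolute value of the channel, so the output cannot be driven entirely to zero.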
Moreover, different samples receive different thresholds. Therefore, to some extent, this can be understood as a specialized attention mechanism: it notices features that are irrelevant to the current task, transforms them to values near zero through two convolutional layers, and sets them to zero via soft thresholding; or it notices features that are relevant to the current task, transforms them to values far from zero through the two convolutional layers, and retains them.
Finally, by stacking many such basic modules together with convolutional layers, batch normalization, activation functions, global average pooling, and a fully connected output layer, the complete Deep Residual Shrinkage Network is constructed.
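Putting the pieces together, one basic module combines a feature transform, the threshold sub-network, soft thresholding, and an identity shortcut. A heavily simplified sketch (the "convolution" is reduced to a per-channel scaling purely for illustration; a real block uses conv/BN/ReLU layers):

```python
import numpy as np

rng = np.random.default_rng(2)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def soft_threshold(x, tau):
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def shrinkage_block(x, conv, w1, w2):
    """One basic module, heavily simplified: a stand-in 'convolution'
    (per-channel scaling), per-channel thresholds from the attention
    sub-network, soft thresholding, and an identity shortcut."""
    feat = x * conv[:, None, None]                 # placeholder for conv/BN/ReLU
    A = np.abs(feat).mean(axis=(1, 2))             # average |feature| per channel
    alpha = sigmoid(w2 @ np.maximum(w1 @ A, 0.0))  # FC -> ReLU -> FC -> Sigmoid
    tau = (alpha * A)[:, None, None]               # per-channel threshold
    return x + soft_threshold(feat, tau)           # residual (identity) connection

C, H, W = 4, 5, 5
x = rng.standard_normal((C, H, W))
out = shrinkage_block(x, rng.standard_normal(C),
                      rng.standard_normal((C, C)), rng.standard_normal((C, C)))
```

Stacking several such blocks, followed by global average pooling and a fully connected output layer, gives the overall network shape described above.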
5. Generalization Capability
The Deep Residual Shrinkage Network is, in fact, a general feature learning method. This is because, in many feature learning tasks, samples contain noise or other irrelevant information, and this noise and irrelevant information can degrade feature learning performance. For example:
In image classification, if an image contains many other objects, those objects can be regarded as "noise." The Deep Residual Shrinkage Network may be able to use its attention mechanism to notice this "noise" and then use soft thresholding to set the features corresponding to it to zero, thereby improving image classification accuracy.
In speech recognition, especially in noisy environments such as a roadside conversation or a factory floor, the Deep Residual Shrinkage Network may improve speech recognition accuracy, or at least provide an approach that helps improve it.
Reference
Minghang Zhao, Shisheng Zhong, Xuyun Fu, Baoping Tang, Michael Pecht, Deep residual shrinkage networks for fault diagnosis, IEEE Transactions on Industrial Informatics, 2020, 16(7): 4681-4690.
https://ieeexplore.ieee.org/document/8850096
BibTeX
@article{Zhao2020,
author = {Minghang Zhao and Shisheng Zhong and Xuyun Fu and Baoping Tang and Michael Pecht},
title = {Deep Residual Shrinkage Networks for Fault Diagnosis},
journal = {IEEE Transactions on Industrial Informatics},
year = {2020},
volume = {16},
number = {7},
pages = {4681-4690},
doi = {10.1109/TII.2019.2943898}
}
Academic Impact
This paper has received more than 1,400 citations on Google Scholar.
According to incomplete statistics, the Deep Residual Shrinkage Network (DRSN) has been used directly, or adapted and then used, in more than 1,000 publications and studies across many fields, including mechanical engineering, electrical power, vision, healthcare, speech, text, radar, and remote sensing.