The Deep Residual Shrinkage Network is an improved variant of the Deep Residual Network. In essence, it is an integration of the Deep Residual Network, attention mechanisms, and soft thresholding functions.
To some extent, the working principle of the Deep Residual Shrinkage Network can be understood as follows: it uses attention mechanisms to notice unimportant features and applies soft thresholding functions to set them to zero; conversely, it notices important features and retains them. This process strengthens a deep neural network's ability to extract useful features from signals contaminated by noise.
1. Research Motivation
First, when classifying samples, the presence of noise—such as Gaussian noise, pink noise, and Laplacian noise—is unavoidable. In a broader sense, samples often contain information that is irrelevant to the current classification task, and this information can also be regarded as noise. Such noise can degrade classification performance.
For example, during a roadside conversation, the audio signal may be mixed with the sounds of vehicles and wheels. When performing speech recognition on these signals, the results will inevitably be affected by these background sounds. From the deep learning perspective, the features corresponding to the vehicle and wheel sounds should be removed inside the deep neural network to prevent them from affecting the speech recognition results.
Second, even within the same dataset, the amount of noise usually varies from sample to sample. (This is analogous to attention mechanisms: taking an image dataset as an example, the location of the target object may differ across images, and an attention mechanism can focus on the specific location of the target in each image.)
For example, when training a cat-and-dog classifier, consider five images all labeled "dog". The first image may contain a dog and a mouse, the second a dog and a goose, the third a dog and a chicken, the fourth a dog and a donkey, and the fifth a dog and a duck. During training, the classifier will be disturbed by these irrelevant objects—the mice, geese, chickens, donkeys, and ducks—causing classification accuracy to drop. If we could notice these irrelevant objects and delete their features, it might be possible to improve the accuracy of the cat-and-dog classifier.
2. Soft Thresholding
Soft thresholding is a key step in many signal denoising algorithms. It removes features whose absolute values are below a certain threshold and shrinks features whose absolute values exceed that threshold toward zero. It can be implemented with the following formula:
\[y = \begin{cases} x - \tau & x > \tau \\ 0 & -\tau \le x \le \tau \\ x + \tau & x < -\tau \end{cases}\]
The derivative of the soft thresholding output with respect to the input is:
\[\frac{\partial y}{\partial x} = \begin{cases} 1 & x > \tau \\ 0 & -\tau \le x \le \tau \\ 1 & x < -\tau \end{cases}\]
As shown above, the derivative of soft thresholding is either 1 or 0. This property is the same as that of the ReLU activation function. Therefore, soft thresholding can also reduce the risk of gradient vanishing and gradient exploding in deep learning algorithms.
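The piecewise definition and its derivative can be written directly as small functions. The sketch below is a minimal plain-Python illustration; the function names are ours, not from the paper.

```python
def soft_threshold(x: float, tau: float) -> float:
    """Shrink x toward zero: values inside [-tau, tau] become 0,
    values outside are moved tau closer to zero."""
    if x > tau:
        return x - tau
    if x < -tau:
        return x + tau
    return 0.0

def soft_threshold_grad(x: float, tau: float) -> float:
    """Derivative of soft thresholding: 1 outside [-tau, tau], 0 inside,
    matching the piecewise derivative above."""
    return 1.0 if abs(x) > tau else 0.0

# With tau = 0.5:
print(soft_threshold(2.0, 0.5))   # 1.5
print(soft_threshold(-0.3, 0.5))  # 0.0
print(soft_threshold(-1.2, 0.5))  # -0.7
print(soft_threshold_grad(-0.3, 0.5))  # 0.0
print(soft_threshold_grad(2.0, 0.5))   # 1.0
```

Note that, like ReLU, the derivative takes only the values 0 and 1, which is why gradients neither vanish gradually nor blow up as they pass through this operation.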
In the soft thresholding function, the threshold must satisfy two conditions: first, it must be a positive number; second, it must not exceed the maximum absolute value of the input signal, otherwise the output would be all zeros.
Moreover, it is best for the threshold to satisfy a third condition: each sample should have its own independent threshold, determined by its own noise content.
This is because the amount of noise often differs across samples. For example, it is common that, within the same dataset, Sample A contains little noise while Sample B contains heavy noise. In that case, when soft thresholding is applied in a denoising algorithm, Sample A should use a smaller threshold while Sample B should use a larger one. Although these features and thresholds lose their explicit physical definitions inside a deep neural network, the basic principle remains the same: each sample should have its own independent threshold based on its noise content.
3. Attention Mechanism
Attention mechanisms are relatively easy to understand in the field of computer vision. Animal visual systems can quickly scan an entire area to detect a target, then focus attention on the target object to extract more detail while ignoring irrelevant information. For more details, please refer to the literature on attention mechanisms.
The Squeeze-and-Excitation Network (SENet) represents a relatively new deep learning method that employs an attention mechanism. Across different samples, different feature channels usually contribute differently to the classification task. SENet uses a small sub-network to obtain a set of weights and multiplies these weights with the features of the corresponding channels, adjusting the magnitude of each channel's features. This process can be viewed as applying a weighting to each feature channel.
In this way, each sample has its own independent set of weights; in other words, the weights of any two samples can differ. In SENet, the specific path for obtaining the weights is "Global Pooling → Fully Connected Layer → ReLU Function → Fully Connected Layer → Sigmoid Function."
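The weight path "Global Pooling → Fully Connected Layer → ReLU → Fully Connected Layer → Sigmoid" can be sketched in plain Python. The layer sizes and the randomly initialized weights below are illustrative placeholders only; in a real SENet these weights are learned during training.

```python
import math
import random

random.seed(0)

def se_weights(feature_map, hidden=4):
    """feature_map: list of channels, each a list of feature values.
    Returns one attention weight in (0, 1) per channel."""
    c = len(feature_map)
    # Global average pooling: one scalar per channel.
    pooled = [sum(ch) / len(ch) for ch in feature_map]
    # Two fully connected layers with placeholder random weights.
    w1 = [[random.uniform(-1, 1) for _ in range(c)] for _ in range(hidden)]
    w2 = [[random.uniform(-1, 1) for _ in range(hidden)] for _ in range(c)]
    h = [max(0.0, sum(w * p for w, p in zip(row, pooled))) for row in w1]  # FC -> ReLU
    z = [sum(w * v for w, v in zip(row, h)) for row in w2]                 # FC
    return [1 / (1 + math.exp(-v)) for v in z]                             # Sigmoid

fmap = [[0.2, 0.4], [1.0, 1.2], [-0.5, 0.3]]   # 3 channels, 2 values each
weights = se_weights(fmap)
# Channel-wise reweighting: scale each channel by its weight.
scaled = [[w * v for v in ch] for w, ch in zip(weights, fmap)]
print(weights)  # three values, each strictly between 0 and 1
```

Because the weights are computed from each sample's own pooled features, two different samples generally receive two different sets of channel weights, which is exactly the per-sample behavior described above.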
4. Soft Thresholding with Deep Attention Mechanism
The Deep Residual Shrinkage Network draws on the SENet sub-network structure described above to implement soft thresholding under a deep attention mechanism. Through this sub-network (shown in the red box), the network can learn a set of thresholds and apply soft thresholding to each feature channel.
In this sub-network, the absolute values of all features in the input feature map are computed first. Then, through global average pooling, a single feature is obtained, denoted A. In the other path, the feature map after global average pooling is fed into a small fully connected network. This network uses a Sigmoid function as its last layer to normalize the output to the range (0, 1), yielding a coefficient denoted α. The final threshold is expressed as α × A. The threshold is therefore the product of a number between 0 and 1 and the average of the absolute values of the feature map. This design ensures that the threshold is not only positive but also not excessively large.
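For one channel, the threshold computation τ = α × A can be sketched in plain Python. Here `alpha_logit` stands in for the output of the small fully connected network before the Sigmoid; it is passed in directly for illustration, whereas in the real network it is learned.

```python
import math

def channel_threshold(channel, alpha_logit):
    """channel: list of feature values for one channel.
    Returns the threshold alpha * A, where A is the average
    absolute value and alpha = sigmoid(alpha_logit) in (0, 1)."""
    A = sum(abs(v) for v in channel) / len(channel)  # average absolute value
    alpha = 1 / (1 + math.exp(-alpha_logit))         # Sigmoid -> (0, 1)
    return alpha * A

def soft_threshold(x, tau):
    if x > tau:
        return x - tau
    if x < -tau:
        return x + tau
    return 0.0

channel = [0.9, -0.1, 0.5, -1.3]
tau = channel_threshold(channel, alpha_logit=0.0)  # alpha = 0.5, A = 0.7, so tau ≈ 0.35
denoised = [soft_threshold(v, tau) for v in channel]
print(tau)
print(denoised)  # small feature -0.1 is zeroed; the rest shrink toward zero
```

Since α < 1, the threshold is always strictly smaller than the average absolute value of the channel, so the output can never be all zeros, satisfying both conditions on the threshold stated earlier.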
Furthermore, different samples yield different thresholds. Consequently, to some extent, this can be understood as a special attention mechanism: it notices features irrelevant to the current task, transforms them into values close to zero through the two convolutional layers, and sets them to zero via soft thresholding; conversely, it notices features relevant to the current task, transforms them into values far from zero through the two convolutional layers, and retains them.
Finally, by stacking many such basic modules together with convolutional layers, batch normalization, activation functions, global average pooling, and a fully connected output layer, the complete Deep Residual Shrinkage Network is constructed. Other important components, such as the identity path and the channel weighting, can also be seen in the figures.
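Putting the pieces together, one basic module adds the soft-thresholded residual path back onto the identity path. The sketch below is a simplified illustration only: an elementwise `scale` stands in for the block's two convolutional layers, and the fully connected sub-network's output is fixed at 0, so α = 0.5.

```python
import math

def soft_threshold(x, tau):
    return x - tau if x > tau else (x + tau if x < -tau else 0.0)

def shrinkage_block(x, scale=0.8):
    """One simplified residual shrinkage block on a 1-D feature list.
    `scale` is a placeholder for the conv/BN/ReLU residual path."""
    residual = [scale * v for v in x]                    # placeholder residual path
    A = sum(abs(v) for v in residual) / len(residual)    # average absolute value
    alpha = 1 / (1 + math.exp(-0.0))                     # placeholder FC output -> 0.5
    tau = alpha * A                                      # per-sample threshold
    shrunk = [soft_threshold(v, tau) for v in residual]  # suppress small features
    return [a + b for a, b in zip(x, shrunk)]            # add identity path

out = shrinkage_block([1.0, -0.05, 0.6])
print(out)  # the small feature -0.05 passes through the identity path unchanged
```

Stacking several such blocks, followed by global average pooling and a fully connected output layer, yields the overall architecture described above.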
5. Generalization Capability
The Deep Residual Shrinkage Network is, in essence, a general-purpose feature learning method. This is because, in many feature learning tasks, samples tend to contain some noise as well as irrelevant information, both of which can harm feature learning performance. For example:
In image classification, if an image simultaneously contains many other objects, these objects can be understood as "noise." The Deep Residual Shrinkage Network may be able to use its attention mechanism to notice this "noise" and then use soft thresholding to set the features corresponding to it to zero, thereby potentially improving image classification accuracy.
In speech recognition, especially in noisy environments such as a roadside conversation or a factory workshop, the Deep Residual Shrinkage Network may improve speech recognition accuracy, or at least offers an approach capable of doing so.
Reference
Minghang Zhao, Shisheng Zhong, Xuyun Fu, Baoping Tang, Michael Pecht, Deep residual shrinkage networks for fault diagnosis, IEEE Transactions on Industrial Informatics, 2020, 16(7): 4681-4690.
https://ieeexplore.ieee.org/document/8850096
BibTeX
@article{Zhao2020,
author = {Minghang Zhao and Shisheng Zhong and Xuyun Fu and Baoping Tang and Michael Pecht},
title = {Deep Residual Shrinkage Networks for Fault Diagnosis},
journal = {IEEE Transactions on Industrial Informatics},
year = {2020},
volume = {16},
number = {7},
pages = {4681-4690},
doi = {10.1109/TII.2019.2943898}
}
Academic Impact
This paper has received more than 1,400 citations on Google Scholar.
According to incomplete statistics, the Deep Residual Shrinkage Network (DRSN) has been directly used or further improved in more than 1,000 publications and studies across various fields, including mechanical engineering, electrical power, computer vision, healthcare, speech, text, radar, and remote sensing.