The Deep Residual Shrinkage Network is an improved variant of the Deep Residual Network. In short, it combines the Deep Residual Network, attention mechanisms, and soft thresholding functions.
In a sense, the way the Deep Residual Shrinkage Network works can be understood as follows: it uses attention mechanisms to identify unimportant features and uses soft thresholding functions to set them to zero; conversely, it identifies important features and retains them. This process improves the ability of a deep neural network to extract useful features from noisy signals.
1. Research Motivation
First, when classifying samples, the presence of noise (such as Gaussian noise, pink noise, and Laplacian noise) is unavoidable. More broadly, samples often contain information that is irrelevant to the current classification task, and this information can also be regarded as noise. Such noise can adversely affect classification results. (Soft thresholding is a key step in many signal denoising algorithms.)
For example, when you talk with someone by the roadside, the voice may be mixed with the sounds of car horns and wheels. If speech recognition is performed on these signals, the results will be affected by these background sounds. From a deep learning perspective, the features associated with the horns and wheels should be eliminated inside the deep neural network so that they do not affect the speech recognition results.
Second, even within a single dataset, the amount of noise often varies from sample to sample. (This has something in common with attention mechanisms: taking an image dataset as an example, the location of the target object may differ from image to image, and attention mechanisms can focus on the specific location of the target object in each image.)
For example, when training a cat-and-dog classifier, consider five images labeled “dog”. The first image may contain a dog and a mouse, the second a dog and a goose, the third a dog and a chicken, the fourth a dog and a donkey, and the fifth a dog and a duck. During training, the classifier will be disturbed by these irrelevant objects (mice, geese, chickens, donkeys, and ducks), which can reduce classification accuracy. If we can identify these irrelevant objects and eliminate their features, it becomes possible to improve the accuracy of the cat-and-dog classifier.
2. Soft Thresholding
Soft thresholding is a core step in many signal denoising algorithms. It eliminates features whose absolute values are below a certain threshold and shrinks features whose absolute values are above that threshold toward zero. It can be implemented with the following formula:
\[y = \begin{cases} x - \tau & x > \tau \\ 0 & -\tau \le x \le \tau \\ x + \tau & x < -\tau \end{cases}\]
The derivative of the soft thresholding output with respect to the input is:
\[\frac{\partial y}{\partial x} = \begin{cases} 1 & x > \tau \\ 0 & -\tau \le x \le \tau \\ 1 & x < -\tau \end{cases}\]
As the formula shows, the derivative of soft thresholding is either 1 or 0. This property is the same as that of the ReLU activation function. Therefore, soft thresholding can reduce the risk of deep learning algorithms running into gradient vanishing and gradient exploding.
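The two piecewise formulas above can be written out directly. The following is a minimal sketch in plain Python for scalar inputs; the function names `soft_threshold` and `soft_threshold_grad` are illustrative choices, not names from the paper.

```python
def soft_threshold(x, tau):
    """Soft thresholding: zero out values with |x| <= tau, shrink the rest by tau."""
    if x > tau:
        return x - tau
    if x < -tau:
        return x + tau
    return 0.0


def soft_threshold_grad(x, tau):
    """Derivative of soft_threshold with respect to x: 1 outside [-tau, tau], else 0."""
    return 1.0 if abs(x) > tau else 0.0


inputs = [-3.0, -0.5, 0.0, 0.5, 3.0]
tau = 1.0
outputs = [soft_threshold(x, tau) for x in inputs]
grads = [soft_threshold_grad(x, tau) for x in inputs]
print(outputs)  # [-2.0, 0.0, 0.0, 0.0, 2.0]
print(grads)    # [1.0, 0.0, 0.0, 0.0, 1.0]
```

Values inside the interval [-τ, τ] are removed, while larger values keep their sign and lose τ in magnitude.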
In the soft thresholding function, the choice of threshold must satisfy two conditions: first, the threshold must be a positive number; second, the threshold must not exceed the maximum absolute value of the input signal, otherwise the output will be entirely zero.
In addition, it is preferable for the threshold to satisfy a third condition: each sample should have its own independent threshold, determined by its own noise level.
The reason is that the noise level often varies across samples. For example, it is common within a single dataset for Sample A to contain little noise while Sample B contains a lot. In that case, when soft thresholding is applied in a denoising algorithm, Sample A should use a smaller threshold and Sample B a larger one. Although these features and thresholds lose their explicit physical meaning inside deep neural networks, the underlying principle remains the same. In other words, each sample should have its own independent threshold, determined by its noise content.
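The three conditions can be illustrated with a small sketch. Everything concrete here is an assumption made for illustration: the helper `is_valid_threshold`, the two toy samples, and the hand-picked thresholds `tau_a` and `tau_b`. The check uses a strict upper bound because a threshold equal to the largest absolute value also zeroes the whole signal.

```python
def soft_threshold(x, tau):
    """Soft thresholding for a scalar input."""
    if x > tau:
        return x - tau
    if x < -tau:
        return x + tau
    return 0.0


def is_valid_threshold(signal, tau):
    """Hypothetical check: tau is positive and below the largest |x| in the signal."""
    return 0 < tau < max(abs(x) for x in signal)


# Toy samples (made up): the same two large components plus small or large disturbances.
sample_a = [1.0, 0.05, -0.04, 0.9]   # assumed low-noise sample
sample_b = [1.0, 0.4, -0.5, 0.9]     # assumed high-noise sample
tau_a, tau_b = 0.1, 0.6              # hand-picked per-sample thresholds

denoised_a = [soft_threshold(x, tau_a) for x in sample_a]
denoised_b = [soft_threshold(x, tau_b) for x in sample_b]
print(denoised_a)
print(denoised_b)

# Reusing the small threshold on the noisy sample leaves its disturbances non-zero.
under_denoised_b = [soft_threshold(x, tau_a) for x in sample_b]

# A threshold above the largest absolute value wipes out the whole signal.
all_zero = [soft_threshold(x, 2.0) for x in sample_a]
print(all_zero)  # [0.0, 0.0, 0.0, 0.0]
```

With its own threshold, each sample has its small components set to zero while the two large components survive in shrunken form.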
3. Attention Mechanism
Attention mechanisms are relatively easy to understand in the field of computer vision. The visual system of animals can distinguish objects by rapidly scanning the whole scene and then focusing attention on the target object to pick up more detail, while ignoring irrelevant information. For a full treatment, please refer to the literature on attention mechanisms.
The Squeeze-and-Excitation Network (SENet) is a relatively recent deep learning method that uses attention mechanisms. Across different samples, the contribution of different feature channels to the classification task often differs. SENet uses a small sub-network to learn a set of weights and then multiplies these weights by the features of the corresponding channels. This applies a weighting to each feature channel, adjusting the magnitude of its features. The process can be viewed as applying different levels of attention to different feature channels.
In this approach, each sample has its own independent set of weights. In other words, the weights for any two samples are different. In SENet, the specific path for obtaining the weights is “Global Pooling → Fully Connected Layer → ReLU Function → Fully Connected Layer → Sigmoid Function.”
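The weight path described above can be traced with a toy example. This is a simplified sketch in plain Python, not the SENet implementation: a feature map is represented as a list of channels, and the layer parameters `w1`, `b1`, `w2`, `b2` are made-up numbers standing in for values that would normally be learned.

```python
import math


def global_average_pool(feature_map):
    """Reduce each channel (a list of values) to its mean."""
    return [sum(channel) / len(channel) for channel in feature_map]


def fully_connected(vec, weights, bias):
    """Dense layer: one output per row of `weights`."""
    return [sum(w * v for w, v in zip(row, vec)) + b for row, b in zip(weights, bias)]


def relu(vec):
    return [max(0.0, v) for v in vec]


def sigmoid(vec):
    return [1.0 / (1.0 + math.exp(-v)) for v in vec]


def se_channel_weights(feature_map, w1, b1, w2, b2):
    """Global Pooling -> Fully Connected -> ReLU -> Fully Connected -> Sigmoid."""
    squeezed = global_average_pool(feature_map)
    hidden = relu(fully_connected(squeezed, w1, b1))
    return sigmoid(fully_connected(hidden, w2, b2))


def apply_channel_weights(feature_map, weights):
    """Scale every value of a channel by that channel's weight."""
    return [[w * v for v in channel] for channel, w in zip(feature_map, weights)]


# One sample with 3 channels and 2 positions per channel (toy values).
feature_map = [[1.0, 3.0], [2.0, 2.0], [-1.0, 1.0]]
w1 = [[0.5, -0.5, 0.0], [0.0, 0.5, 0.5]]    # 3 -> 2, made-up parameters
b1 = [0.0, 0.0]
w2 = [[1.0, 0.0], [0.0, 1.0], [-1.0, 1.0]]  # 2 -> 3, made-up parameters
b2 = [0.0, 0.0, 0.0]

weights = se_channel_weights(feature_map, w1, b1, w2, b2)
scaled = apply_channel_weights(feature_map, weights)
print(weights)
print(scaled)
```

Because the weights are computed from the sample's own pooled features, a different input feature map generally produces a different set of weights.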
4. Soft Thresholding with Deep Attention Mechanism
The Deep Residual Shrinkage Network borrows the SENet sub-network structure described above to implement soft thresholding under a deep attention mechanism. Through this sub-network (shown in the red box), the network can learn a set of thresholds and apply soft thresholding to each feature channel.
In this sub-network, the absolute values of all features are computed first. Then, through global average pooling, a feature is obtained, denoted A. On the other path, the feature map produced by global average pooling is fed into a small fully connected network. This fully connected network uses a Sigmoid function as its last layer to normalize the output to between 0 and 1, yielding a coefficient denoted α. The final threshold can be written as α × A. The threshold is therefore the product of a number between 0 and 1 and the average of the absolute values of the feature map. This approach ensures that the threshold is positive and also not too large.
Moreover, different samples end up with different thresholds. To some extent, this can therefore be understood as a special kind of attention mechanism: it identifies features that are irrelevant to the current task, transforms them into values close to zero through two convolutional layers, and sets them to zero using soft thresholding; or it identifies features that are relevant to the current task, transforms them into values far from zero through two convolutional layers, and preserves them.
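The threshold rule α × A can be sketched as follows, assuming one threshold per channel. The callable `toy_fc` is a made-up stand-in for the small fully connected network (whose parameters would be learned), and the two samples are invented for illustration.

```python
import math


def channel_thresholds(feature_map, fc):
    """Return one threshold per channel as tau = alpha * A.

    feature_map: list of channels, each a list of floats (one sample).
    fc: stand-in for the small fully connected network (hypothetical callable).
    """
    # A: global average pooling over the absolute values of each channel.
    abs_mean = [sum(abs(v) for v in channel) / len(channel) for channel in feature_map]
    # alpha: the fully connected output squashed into (0, 1) by a sigmoid.
    alpha = [1.0 / (1.0 + math.exp(-z)) for z in fc(abs_mean)]
    return [a * m for a, m in zip(alpha, abs_mean)]


def toy_fc(vec):
    """Made-up stand-in for learned layers: a fixed shift of each input."""
    return [v - 1.0 for v in vec]


sample_a = [[0.1, -0.1, 0.2, 0.0], [1.0, -1.0, 1.0, -1.0]]   # smaller magnitudes
sample_b = [[0.8, -1.2, 1.0, -0.6], [2.0, -2.0, 2.0, -2.0]]  # larger magnitudes

taus_a = channel_thresholds(sample_a, toy_fc)
taus_b = channel_thresholds(sample_b, toy_fc)
print(taus_a)
print(taus_b)
```

Since α lies strictly between 0 and 1, each threshold is smaller than the mean absolute value of its channel, and it is positive whenever the channel is not identically zero. The two samples receive different thresholds because A and α are both computed from the sample itself.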
Finally, by stacking a number of basic modules together with convolutional layers, batch normalization, activation functions, global average pooling, and a fully connected output layer, the complete Deep Residual Shrinkage Network is built. In the figure, the Identity path helps propagate information, and the Weighting part controls the importance of the different components.
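How a basic module combines the identity path with the thresholded path, and how modules are stacked, can be sketched in a heavily simplified form. This is not the network from the paper: `toy_transform` is a made-up stand-in for the convolutional, batch normalization, and activation layers, `toy_fc` is a made-up stand-in for the learned fully connected network, and the input and output layers of the full network are omitted.

```python
import math


def soft_threshold(x, tau):
    """Soft thresholding for a scalar input."""
    if x > tau:
        return x - tau
    if x < -tau:
        return x + tau
    return 0.0


def shrinkage_unit(feature_map, transform, fc):
    """One simplified basic module: identity path plus a soft-thresholded path."""
    residual = transform(feature_map)  # stand-in for the conv / BN / activation layers
    abs_mean = [sum(abs(v) for v in ch) / len(ch) for ch in residual]
    alpha = [1.0 / (1.0 + math.exp(-z)) for z in fc(abs_mean)]
    taus = [a * m for a, m in zip(alpha, abs_mean)]  # per-channel thresholds
    shrunk = [[soft_threshold(v, t) for v in ch] for ch, t in zip(residual, taus)]
    # Identity path: add the unchanged input to the shrunken features.
    return [[x + s for x, s in zip(ch_in, ch_out)]
            for ch_in, ch_out in zip(feature_map, shrunk)]


def stack_units(feature_map, units):
    """Apply several basic modules one after another."""
    for transform, fc in units:
        feature_map = shrinkage_unit(feature_map, transform, fc)
    return feature_map


def toy_transform(feature_map):
    """Made-up stand-in for learned layers: halve every value."""
    return [[0.5 * v for v in ch] for ch in feature_map]


def toy_fc(vec):
    """Made-up stand-in for the fully connected network: always outputs zeros."""
    return [0.0 for _ in vec]


x = [[4.0, -0.4, 0.2], [1.0, 1.0, -1.0]]
one_unit = shrinkage_unit(x, toy_transform, toy_fc)
two_units = stack_units(x, [(toy_transform, toy_fc), (toy_transform, toy_fc)])
print(one_unit)
print(two_units)
```

In the first channel, the two small values fall below the threshold on the shrinkage path, so the module passes them through unchanged via the identity path, while the large value receives an additional shrunken contribution.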
5. Generalization Capability
The Deep Residual Shrinkage Network is, in fact, a general feature learning method. The reason is that in many feature learning tasks, samples contain some amount of noise and irrelevant information, and this noise and irrelevant information can degrade feature learning. For example:
In image classification, if an image contains many other objects, these objects can be understood as “noise.” The Deep Residual Shrinkage Network may be able to use the attention mechanism to notice this “noise” and then use soft thresholding to set the features of this “noise” to zero, which can improve image classification accuracy.
In speech recognition, especially in noisy environments such as by the roadside or in a factory, the Deep Residual Shrinkage Network may improve speech recognition accuracy, or at least offer an approach that could improve it.
Reference
Minghang Zhao, Shisheng Zhong, Xuyun Fu, Baoping Tang, Michael Pecht, Deep residual shrinkage networks for fault diagnosis, IEEE Transactions on Industrial Informatics, 2020, 16(7): 4681-4690.
https://ieeexplore.ieee.org/document/8850096
BibTeX
@article{Zhao2020,
author = {Minghang Zhao and Shisheng Zhong and Xuyun Fu and Baoping Tang and Michael Pecht},
title = {Deep Residual Shrinkage Networks for Fault Diagnosis},
journal = {IEEE Transactions on Industrial Informatics},
year = {2020},
volume = {16},
number = {7},
pages = {4681-4690},
doi = {10.1109/TII.2019.2943898}
}
Academic Impact
This paper has been cited more than 1,400 times on Google Scholar.
According to incomplete statistics, the Deep Residual Shrinkage Network (DRSN) has been applied directly, or modified and then applied, in more than 1,000 publications and studies across a wide range of fields, including mechanical engineering, electrical power, vision, healthcare, speech, text, radar, and remote sensing.