The Deep Residual Shrinkage Network is a novel, improved variant of the Deep Residual Network. In essence, it is a combination of the Deep Residual Network, attention mechanisms, and soft thresholding functions.
To some extent, the working mechanism of the Deep Residual Shrinkage Network can be understood as follows: it uses attention mechanisms to notice unimportant features and uses soft thresholding functions to set them to zero (i.e., to eliminate them); conversely, it notices important features and retains them. This process strengthens a deep neural network's ability to extract useful features from noisy signals.
1. Research Motivation
First, when classifying samples, the presence of noise (such as Gaussian noise, pink noise, and Laplacian noise) is unavoidable. More broadly, samples often contain information that is irrelevant to the current classification task, and this information can also be regarded as noise. Such noise can degrade classification performance. (Soft thresholding is a key step in many signal denoising algorithms.)
For example, when people converse beside a road, the speech may be mixed with the sounds of car horns and wheels. When performing speech recognition on such signals, the results will inevitably be affected by these background sounds. From a deep learning perspective, the features corresponding to the horns and wheels should be eliminated inside the deep neural network so that they do not affect the speech recognition results.
Second, even within a single dataset, the amount of noise usually varies from sample to sample. (This is related to attention mechanisms; taking an image dataset as an example, the location of the target object may differ across images, and an attention mechanism can focus on the target's specific location in each individual image.)
For example, when training a cat-and-dog classifier, consider five images all labeled "dog". The first image may contain a dog and a mouse, the second a dog and a goose, the third a dog and a chicken, the fourth a dog and a donkey, and the fifth a dog and a duck. During training, the classifier will be disturbed by these irrelevant objects (the mice, geese, chickens, donkeys, and ducks), which may reduce classification accuracy. If we can notice these irrelevant objects and eliminate their features, it may be possible to improve the accuracy of the cat-and-dog classifier.
2. Soft Thresholding
Soft thresholding is a key step in many signal denoising algorithms. It sets features whose absolute values are below a certain threshold to zero, and shrinks features whose absolute values are above the threshold toward zero. It can be implemented with the following formula:
\[y = \begin{cases} x - \tau & x > \tau \\ 0 & -\tau \le x \le \tau \\ x + \tau & x < -\tau \end{cases}\]The derivative of the soft thresholding output with respect to the input is:
\[\frac{\partial y}{\partial x} = \begin{cases} 1 & x > \tau \\ 0 & -\tau \le x \le \tau \\ 1 & x < -\tau \end{cases}\]As shown above, the derivative of soft thresholding is either 1 or 0. This property is the same as that of the ReLU activation function, so soft thresholding can likewise reduce the risk of the gradient vanishing and gradient exploding problems in deep learning algorithms.
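The two formulas above can be sketched in a few lines of numpy; this is a minimal illustration of the piecewise definitions, not the paper's implementation:

```python
import numpy as np

def soft_threshold(x, tau):
    """Soft thresholding: shrink each value toward zero by tau,
    setting everything with |x| <= tau exactly to zero."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def soft_threshold_grad(x, tau):
    """Derivative w.r.t. the input: 1 outside [-tau, tau], 0 inside,
    matching the piecewise formula above."""
    return (np.abs(x) > tau).astype(float)

x = np.array([-2.0, -0.5, 0.0, 0.3, 1.5])
y = soft_threshold(x, 1.0)      # only |x| > 1 survives, shrunk by 1
g = soft_threshold_grad(x, 1.0) # gradient is 1 or 0, like ReLU
```

Note that, just as with ReLU, the gradient takes only the values 0 and 1, which is why backpropagation through this operation does not by itself amplify or shrink gradients.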
In the soft thresholding function, the setting of the threshold must satisfy two conditions: first, the threshold must be positive; second, the threshold must not exceed the maximum absolute value of the input signal, otherwise the output would be all zeros.
In addition, it is preferable for the threshold to satisfy a third condition: each sample should have its own independent threshold, determined by its own noise content.
This is because the amount of noise often varies across samples. For example, it is common that, within the same dataset, Sample A contains little noise while Sample B contains a lot. In that case, when applying soft thresholding in a denoising algorithm, Sample A should use a smaller threshold while Sample B should use a larger one. Although features and thresholds lose their explicit physical meaning inside a deep neural network, the underlying logic is the same. In other words, each sample should have its own independent threshold, determined by its own noise content.
3. Attention Mechanism
Attention mechanisms are easy to understand in the field of computer vision. The visual system of animals can rapidly scan an entire scene to locate the target object, then focus attention on it, extracting more detail from the target while suppressing irrelevant information. For details, please refer to the literature on attention mechanisms.
The Squeeze-and-Excitation Network (SENet) represents a relatively new deep learning method built on attention mechanisms. Across different samples, the contribution of different feature channels to the classification task often differs. SENet uses a small sub-network to learn a set of weights, and multiplies these weights with the features of the corresponding channels to adjust the magnitude of the features in each channel (i.e., apply weighting to each feature channel). This process can be viewed as applying different levels of attention to different feature channels.
In this way, each sample has its own independent set of weights. In other words, the weights of any two samples differ. In SENet, the path for obtaining the weights is "Global Pooling → Fully Connected Layer → ReLU Function → Fully Connected Layer → Sigmoid Function."
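This weight path can be sketched in numpy as follows. The weight matrices, reduction ratio, and feature-map sizes here are illustrative assumptions (a trained SENet would learn them), but the pipeline Global Pooling → FC → ReLU → FC → Sigmoid is exactly the one described above:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def se_weights(feature_map, W1, b1, W2, b2):
    """SENet-style channel weights for one sample:
    Global Pooling -> FC -> ReLU -> FC -> Sigmoid.
    feature_map has shape (channels, height, width)."""
    z = feature_map.mean(axis=(1, 2))   # global average pooling -> (C,)
    h = relu(W1 @ z + b1)               # first fully connected layer + ReLU
    return sigmoid(W2 @ h + b2)         # one weight in (0, 1) per channel

rng = np.random.default_rng(0)
C, r = 8, 2                             # channels and reduction ratio (assumed)
fmap = rng.normal(size=(C, 4, 4))
W1, b1 = rng.normal(size=(C // r, C)), np.zeros(C // r)
W2, b2 = rng.normal(size=(C, C // r)), np.zeros(C)
w = se_weights(fmap, W1, b1, W2, b2)
reweighted = fmap * w[:, None, None]    # apply weighting to each feature channel
```

Because the weights are computed from each sample's own feature map, two different samples naturally obtain two different sets of weights, which is the point made in the paragraph above.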
4. Soft Thresholding under a Deep Attention Mechanism
The Deep Residual Shrinkage Network borrows from the structure of the SENet sub-network described above to implement soft thresholding under a deep attention mechanism. Through this sub-network (shown in the red box of the paper's diagram), a set of thresholds can be learned, so that soft thresholding is applied to each feature channel.
Inside this sub-network, the absolute values of all features of the input feature map are computed first. Then, through global average pooling, a feature is obtained, denoted A. In the other branch, the feature map after global average pooling is fed into a small fully connected network. This fully connected network uses a Sigmoid function as its final layer to normalize its output to the range (0, 1), yielding a coefficient denoted α. The final threshold can be written as α × A. The threshold is therefore the product of a number between 0 and 1 and the average of the absolute values of the feature map. This construction guarantees that the threshold is positive and not too large.
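The τ = α × A construction can be sketched as below. The hidden-layer shapes and random weights are assumptions for illustration; the point is that the resulting thresholds are always positive and never exceed the average absolute value A of each channel:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_thresholds(feature_map, W1, b1, W2, b2):
    """DRSN-style threshold sub-network (channel-wise variant):
    tau_c = alpha_c * A_c, where A_c is the average absolute value of
    channel c and alpha_c in (0, 1) comes from a small FC network."""
    A = np.abs(feature_map).mean(axis=(1, 2))  # average |feature| per channel
    h = np.maximum(W1 @ A + b1, 0.0)           # FC + ReLU (assumed hidden layer)
    alpha = sigmoid(W2 @ h + b2)               # scaling coefficients in (0, 1)
    return alpha * A                           # positive, bounded above by A

rng = np.random.default_rng(1)
C = 4
fmap = rng.normal(size=(C, 8, 8))
W1, b1 = rng.normal(size=(C, C)), np.zeros(C)
W2, b2 = rng.normal(size=(C, C)), np.zeros(C)
tau = channel_thresholds(fmap, W1, b1, W2, b2)
```

Since sigmoid output is strictly between 0 and 1, each threshold satisfies 0 < τ < A automatically, which is precisely the two hard conditions on the threshold stated in Section 2.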
Moreover, different samples end up with different thresholds. Therefore, to some extent, this can be understood as a special attention mechanism: it notices features irrelevant to the current task, transforms them into values close to zero through the two convolutional layers, and sets them to zero via soft thresholding; conversely, it notices relevant features, transforms them into values far from zero, and retains them.
Finally, by stacking a certain number of these basic modules (Stack many basic modules) together with convolutional layers, batch normalization, activation functions, global average pooling, and a fully connected output layer, the complete Deep Residual Shrinkage Network is constructed.
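The pieces above can be assembled into a sketch of one basic module and of module stacking. This is a deliberately simplified illustration: the channel-mixing matrices below are stand-ins for the convolution + batch-normalization layers of the real network, and all weights are random rather than trained:

```python
import numpy as np

def soft_threshold(x, tau):
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def shrinkage_unit(x, params):
    """One basic module: two transforms (stand-ins for conv+BN layers),
    a learned channel-wise threshold, soft thresholding, and the
    residual shortcut connection."""
    W1, W2, Wa = params
    h = np.maximum(np.einsum('ij,jhw->ihw', W1, x), 0.0)  # "conv"+ReLU
    h = np.einsum('ij,jhw->ihw', W2, h)                   # second "conv"
    A = np.abs(h).mean(axis=(1, 2))                       # average |h| per channel
    tau = sigmoid(Wa @ A) * A                             # tau = alpha * A
    return x + soft_threshold(h, tau[:, None, None])      # identity shortcut

def build_network(x, param_list):
    """Stack many basic modules, then global average pooling; the result
    would feed the fully connected output layer."""
    for params in param_list:
        x = shrinkage_unit(x, params)
    return x.mean(axis=(1, 2))

rng = np.random.default_rng(2)
C = 4
mk = lambda: tuple(rng.normal(size=(C, C)) * 0.3 for _ in range(3))
x = rng.normal(size=(C, 6, 6))
features = build_network(x, [mk() for _ in range(3)])
```

The key design choice visible here is that soft thresholding is applied to the residual branch before the shortcut addition, so the shrinkage removes noise-related features without disturbing the identity path.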
5. Generalization Capability
The Deep Residual Shrinkage Network is in fact a general feature learning method. This is because, in many feature learning tasks, samples contain noise and irrelevant information, and this noise and irrelevant information can impair feature learning. For example:
In image classification, when an image contains many other objects, these objects can be regarded as "noise". The Deep Residual Shrinkage Network may be able to use its attention mechanism to notice this "noise" and use soft thresholding to set the corresponding features to zero, thereby improving image classification accuracy.
In speech recognition, especially in noisy settings such as conversations beside a road or inside a factory, the Deep Residual Shrinkage Network may improve recognition accuracy, or at least offers an approach that could do so.
Reference
Minghang Zhao, Shisheng Zhong, Xuyun Fu, Baoping Tang, Michael Pecht, Deep residual shrinkage networks for fault diagnosis, IEEE Transactions on Industrial Informatics, 2020, 16(7): 4681-4690.
https://ieeexplore.ieee.org/document/8850096
BibTeX
@article{Zhao2020,
author = {Minghang Zhao and Shisheng Zhong and Xuyun Fu and Baoping Tang and Michael Pecht},
title = {Deep Residual Shrinkage Networks for Fault Diagnosis},
journal = {IEEE Transactions on Industrial Informatics},
year = {2020},
volume = {16},
number = {7},
pages = {4681-4690},
doi = {10.1109/TII.2019.2943898}
}
Academic Impact
The paper has been cited more than 1,400 times on Google Scholar.
According to incomplete statistics, the Deep Residual Shrinkage Network (DRSN) has been used, either directly or in adapted form, in more than 1,000 papers across a wide range of fields, including mechanical engineering, electrical power, vision, healthcare, speech, text, radar, and remote sensing.