The Deep Residual Shrinkage Network is an improved variant of the Deep Residual Network. In essence, it integrates the Deep Residual Network with attention mechanisms and soft thresholding functions.
Its working principle can be understood as follows. First, the network uses attention mechanisms to notice the unimportant features. Then, it uses soft thresholding functions to set those unimportant features to zero. At the same time, the network notices the important features and preserves them. This process strengthens the deep neural network and helps it extract useful features from noisy signals.
1. Research Motivation
First, when classifying samples, noise is often unavoidable; Gaussian noise, pink noise, and Laplacian noise are typical examples. More broadly, samples often contain information that is irrelevant to the task at hand, and this information can also be regarded as noise. Such noise can hurt classification performance. (Soft thresholding is a key step in many signal denoising algorithms.)
For example, consider a conversation by a roadside. The speech may be mixed with the sounds of car horns and wheels. When speech recognition is applied to such signals, the background sounds inevitably affect the results. From a deep learning perspective, the deep neural network should eliminate the features corresponding to the horn and wheel sounds, so that they do not affect the speech recognition result.
Second, the amount of noise usually differs from sample to sample, even within the same dataset. (This point is related to attention mechanisms. Taking an image dataset as an example, the location of the target object may differ from image to image, and an attention mechanism can focus on the target's specific location in each image.)
For example, consider a cat-and-dog classifier trained with 5 images labeled "dog":
- Image 1 may contain a dog and a mouse.
- Image 2 may contain a dog and a goose.
- Image 3 may contain a dog and a chicken.
- Image 4 may contain a dog and a donkey.
- Image 5 may contain a dog and a duck.
During training, the objects irrelevant to the dog, namely the mouse, goose, chicken, donkey, and duck, will interfere with the classifier and lower the classification accuracy. If we could detect these irrelevant objects and eliminate their features, we might raise the classifier's accuracy.
2. Soft Thresholding
Soft thresholding is a key step in many signal denoising algorithms. It deletes (sets to zero) the features whose absolute values are smaller than a certain threshold and shrinks the features whose absolute values are larger than the threshold towards zero. It can be implemented with the following formula:
\[y = \begin{cases} x - \tau & x > \tau \\ 0 & -\tau \le x \le \tau \\ x + \tau & x < -\tau \end{cases}\]The derivative of the soft thresholding output with respect to its input is:
\[\frac{\partial y}{\partial x} = \begin{cases} 1 & x > \tau \\ 0 & -\tau \le x \le \tau \\ 1 & x < -\tau \end{cases}\]The formula above shows that the derivative of soft thresholding is either 1 or 0, a property it shares with the ReLU activation function. Soft thresholding can therefore also reduce the risk of vanishing and exploding gradients in deep learning algorithms.
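As a quick illustration (a minimal NumPy sketch, not code from the paper), the piecewise formula above collapses to sign(x) · max(|x| − τ, 0):

```python
import numpy as np

def soft_threshold(x, tau):
    """Soft thresholding: set values with |x| <= tau to zero and
    shrink the remaining values towards zero by tau."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

x = np.array([-3.0, -0.5, 0.2, 1.5, 4.0])
print(soft_threshold(x, tau=1.0))  # [-2. -0.  0.  0.5  3.]
```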
In the soft thresholding function, the threshold must satisfy two conditions. First, the threshold must be a positive number. Second, the threshold must not be larger than the maximum absolute value of the input signal; otherwise, the output would be all zeros.
Ideally, the threshold should also satisfy a third condition: each sample should have its own threshold, according to how much noise it contains.
The reason is that the noise content usually differs across samples. For example, within the same dataset, Sample A may contain little noise while Sample B contains heavy noise. In that case, Sample A should use a smaller threshold in soft thresholding and Sample B a larger one. In deep neural networks, although these features and thresholds lose their explicit physical meaning, the basic principle is unchanged: each sample should have its own threshold, determined by its noise content.
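As a hypothetical numeric illustration of this point (the signals and thresholds below are made up, reusing the soft_threshold sketch from earlier):

```python
import numpy as np

def soft_threshold(x, tau):
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

rng = np.random.default_rng(0)
clean = np.array([0.0, 0.0, 3.0, 0.0, -2.5])

sample_a = clean + 0.1 * rng.standard_normal(5)  # Sample A: little noise
sample_b = clean + 1.0 * rng.standard_normal(5)  # Sample B: heavy noise

# A small threshold already silences Sample A's noise,
# while Sample B needs a larger threshold to suppress its noise.
print(soft_threshold(sample_a, tau=0.3))
print(soft_threshold(sample_b, tau=1.5))
```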
3. Attention Mechanism
Attention mechanisms are relatively easy to understand in the field of computer vision. The visual systems of humans and animals can quickly scan an entire scene to find the target object and then focus attention on it, which allows the target to be perceived in greater detail while irrelevant information is suppressed. Readers who want to learn more can consult the literature on attention mechanisms.
The Squeeze-and-Excitation Network (SENet) is a relatively new deep learning method built on an attention mechanism. In different samples, different feature channels contribute differently to the classification task. SENet uses a small sub-network to obtain a set of weights and multiplies these weights with the features of the corresponding channels, adjusting the magnitude of each channel's features. This can be viewed as applying different amounts of attention to different feature channels.
In this approach, each sample gets its own set of weights; that is, the weights of any two samples can differ. In SENet, the path for obtaining the weights is: "Global Pooling → Fully Connected Layer → ReLU Function → Fully Connected Layer → Sigmoid Function."
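This weight path can be sketched in PyTorch as follows (an illustrative sketch, not SENet's reference implementation; the reduction ratio of 16 is the value used in the SENet paper):

```python
import torch.nn as nn

class SEBlock(nn.Module):
    """Global Pooling -> FC -> ReLU -> FC -> Sigmoid, then rescale
    each feature channel by its learned weight."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)  # global average pooling
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):                      # x: (N, C, H, W)
        n, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(n, c))   # per-sample channel weights in (0, 1)
        return x * w.view(n, c, 1, 1)          # different attention per channel
```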
4. Soft Thresholding with Deep Attention Mechanism
The Deep Residual Shrinkage Network borrows this sub-network structure from SENet to implement soft thresholding under a deep attention mechanism. The sub-network (shown in the red box of the paper's figure) learns a set of thresholds, which the network then uses to apply soft thresholding to each feature channel.
In this sub-network, the absolute values of all the features in the input feature map are computed first. Global average pooling then produces a feature, denoted A. In the other path, the output of the global average pooling is fed into a small fully connected network whose last layer uses a Sigmoid function to scale the output into the range (0, 1), yielding a coefficient α. The final threshold is α × A. The threshold is thus the product of two terms: a number between 0 and 1, and the average of the absolute values of the feature map. This keeps the threshold positive and prevents it from becoming too large.
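A minimal PyTorch sketch of this threshold sub-network plus the soft thresholding step might look as follows (illustrative only; the layer widths and the per-channel threshold layout are assumptions, not the authors' exact code):

```python
import torch
import torch.nn as nn

class Shrinkage(nn.Module):
    """Estimate a per-sample, per-channel threshold tau = alpha * A,
    where A is the global average of |x| and alpha in (0, 1) comes
    from a small fully connected network ending in a Sigmoid."""
    def __init__(self, channels):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(
            nn.Linear(channels, channels),
            nn.ReLU(inplace=True),
            nn.Linear(channels, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):                      # x: (N, C, H, W)
        n, c, _, _ = x.shape
        a = self.pool(x.abs()).view(n, c)      # A: mean of |x| per channel
        alpha = self.fc(a)                     # coefficient alpha in (0, 1)
        tau = (alpha * a).view(n, c, 1, 1)     # threshold: positive, not too large
        # soft thresholding: sign(x) * max(|x| - tau, 0)
        return torch.sign(x) * torch.clamp(x.abs() - tau, min=0.0)
```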
Moreover, different samples yield different thresholds. This can therefore be seen as a special attention mechanism: it notices the features irrelevant to the current task, transforms them into values close to zero through two convolutional layers, and then sets them to zero via soft thresholding. Equivalently, it notices the features relevant to the current task, transforms them into values far from zero through the two convolutional layers, and preserves them.
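Combined with the Shrinkage sketch above, one basic module could be written as a residual unit (again an assumption-laden sketch; stride and channel changes are omitted for brevity):

```python
import torch.nn as nn

class BasicModule(nn.Module):
    """Two convolutional layers followed by soft thresholding, with an
    identity shortcut that preserves the useful features."""
    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.BatchNorm2d(channels), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(channels), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            Shrinkage(channels),  # threshold sub-network sketched above
        )

    def forward(self, x):
        return x + self.body(x)  # identity shortcut
```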
Finally, stacking many of these basic modules, together with convolutional layers, batch normalization, activation functions, global average pooling, and a fully connected output layer, yields the complete Deep Residual Shrinkage Network.
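The stacking itself might look like this (purely illustrative; depths, widths, and the input shape are placeholders, not the paper's configuration):

```python
import torch.nn as nn

class DRSNSketch(nn.Module):
    """Conv stem -> stacked basic modules -> BN -> ReLU ->
    global average pooling -> fully connected output layer."""
    def __init__(self, channels=16, num_blocks=4, num_classes=10):
        super().__init__()
        self.stem = nn.Conv2d(3, channels, kernel_size=3, padding=1)
        self.blocks = nn.Sequential(
            *[BasicModule(channels) for _ in range(num_blocks)])  # see above
        self.head = nn.Sequential(
            nn.BatchNorm2d(channels), nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(channels, num_classes),
        )

    def forward(self, x):  # x: (N, 3, H, W)
        return self.head(self.blocks(self.stem(x)))
```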
5. Generalization Capability
The Deep Residual Shrinkage Network is in fact a general-purpose feature learning method. This is because in many feature learning tasks the samples contain noise as well as task-irrelevant information, and both can hurt feature learning. For example:
Consider image classification. An image may contain many other objects, which can be regarded as "noise". The Deep Residual Shrinkage Network may use its attention mechanism to notice this "noise" and then use soft thresholding to set the corresponding features to zero, which may improve image classification accuracy.
Consider speech recognition, especially in noisy environments such as a roadside or a factory. The Deep Residual Shrinkage Network may improve speech recognition accuracy, or at least it offers an approach that could do so.
Reference
Minghang Zhao, Shisheng Zhong, Xuyun Fu, Baoping Tang, Michael Pecht, Deep residual shrinkage networks for fault diagnosis, IEEE Transactions on Industrial Informatics, 2020, 16(7): 4681-4690.
https://ieeexplore.ieee.org/document/8850096
BibTeX
@article{Zhao2020,
  author  = {Minghang Zhao and Shisheng Zhong and Xuyun Fu and Baoping Tang and Michael Pecht},
  title   = {Deep Residual Shrinkage Networks for Fault Diagnosis},
  journal = {IEEE Transactions on Industrial Informatics},
  year    = {2020},
  volume  = {16},
  number  = {7},
  pages   = {4681--4690},
  doi     = {10.1109/TII.2019.2943898}
}
Academic Impact
This paper has received more than 1,400 citations on Google Scholar.
According to the literature, researchers have applied the Deep Residual Shrinkage Network (DRSN) in more than 1,000 publications and studies, covering a wide range of fields including mechanical engineering, electrical power, computer vision, healthcare, speech, text, radar, and remote sensing.