The Deep Residual Shrinkage Network is an improved variant of the Deep Residual Network. In short, it integrates the Deep Residual Network with attention mechanisms and soft thresholding functions.
To some extent, the working principle of the Deep Residual Shrinkage Network can be understood as follows: it uses attention mechanisms to notice unimportant features and sets them to zero with soft thresholding functions; conversely, it notices important features and retains them. This process strengthens the ability of a deep neural network to extract useful features from noisy signals.
1. Research Motivation
First, when classifying samples, noise—such as Gaussian noise, pink noise, and Laplacian noise—is inevitable. More broadly, samples often contain information that is irrelevant to the classification task at hand, and that information can also be regarded as noise. Such noise can degrade classification performance. (Soft thresholding is a key step in many signal-denoising algorithms.)
For example, when people converse by a roadside, the sounds of cars and wheels may be mixed into the speech. When speech recognition is performed on such signals, the result inevitably suffers from those background sounds. From a deep-learning perspective, the features corresponding to the car and wheel sounds should be eliminated inside the deep neural network so that they do not harm the speech-recognition result.
Second, even within a single dataset, the amount of noise varies from sample to sample. (This is related to attention mechanisms; taking an image dataset as an example, the location of the target object may differ across images, and an attention mechanism can focus on the exact location of the object in each image.)
For example, when training a cat-and-dog classifier, suppose five images are labeled "dog." The first image might contain a dog and a mouse, the second a dog and a goose, the third a dog and a chicken, the fourth a dog and a donkey, and the fifth a dog and a duck. During training, the classifier is disturbed by the irrelevant animals—the mouse, goose, chicken, donkey, and duck—which reduces accuracy. If we can notice these irrelevant animals and remove their features, we can improve the accuracy of the cat-and-dog classifier.
2. Soft Thresholding
Soft thresholding is a core step in many signal-denoising algorithms. It removes features whose absolute value is below a certain threshold and shrinks the features whose absolute value exceeds that threshold toward zero. It can be implemented with the following formula:
\[y = \begin{cases} x - \tau & x > \tau \\ 0 & -\tau \le x \le \tau \\ x + \tau & x < -\tau \end{cases}\]
The derivative of the soft-thresholding output with respect to the input is:
\[\frac{\partial y}{\partial x} = \begin{cases} 1 & x > \tau \\ 0 & -\tau \le x \le \tau \\ 1 & x < -\tau \end{cases}\]
As shown above, the derivative of soft thresholding is either 1 or 0. This property is the same as that of the ReLU activation function. Therefore, soft thresholding can also reduce the risk of vanishing or exploding gradients in deep-learning algorithms.
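As a quick illustration, the function and its piecewise derivative can be sketched in NumPy (a minimal, self-contained example; the names are mine, not from the paper):

```python
import numpy as np

def soft_threshold(x, tau):
    """Elementwise soft thresholding: zero out values with |x| <= tau,
    shrink the remaining values toward zero by tau."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def soft_threshold_grad(x, tau):
    """Derivative w.r.t. the input: 1 where |x| > tau, 0 elsewhere."""
    return (np.abs(x) > tau).astype(float)

x = np.array([-3.0, -0.5, 0.0, 0.5, 3.0])
y = soft_threshold(x, 1.0)        # values in [-1, 1] become 0, the rest shrink by 1
g = soft_threshold_grad(x, 1.0)   # 1 outside the threshold band, 0 inside
```

Note that, as in the derivative above, the gradient is exactly 0 or 1, mirroring ReLU's behavior.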
In the soft thresholding function, the way the threshold is set must obey two rules: first, the threshold must be positive; second, the threshold must not exceed the maximum absolute value of the input signal, otherwise the output would be all zeros.
In addition, the threshold should ideally satisfy a third rule: each sample should have its own independent threshold, determined by its own noise content.
This is because the noise content often varies greatly between samples. For example, it is common within a single dataset for Sample A to contain little noise while Sample B contains a lot. In that case, when soft thresholding is used in a denoising algorithm, Sample A should use a small threshold and Sample B a large one. Although features and thresholds no longer have clear physical meanings inside deep neural networks, the underlying logic is the same: each sample should have its own independent threshold determined by its noise content.
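To make the per-sample argument concrete, here is a toy NumPy experiment of my own (not from the paper): the same sparse signal is corrupted with two noise levels, and the denoising error is compared for a small and a large threshold.

```python
import numpy as np

rng = np.random.default_rng(0)

def soft_threshold(x, tau):
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

# A sparse "clean" signal: a few large useful coefficients, the rest zero.
clean = np.zeros(200)
clean[::20] = 5.0
sample_a = clean + 0.2 * rng.standard_normal(200)  # low-noise sample
sample_b = clean + 1.0 * rng.standard_normal(200)  # high-noise sample

def mse(x, tau):
    """Denoising error after soft thresholding with threshold tau."""
    return float(np.mean((soft_threshold(x, tau) - clean) ** 2))

# The low-noise sample prefers a small threshold,
# while the high-noise sample prefers a large one.
a_small_better = mse(sample_a, 0.5) < mse(sample_a, 2.0)
b_large_better = mse(sample_b, 2.0) < mse(sample_b, 0.5)
```

With these settings both comparisons come out true: a single shared threshold would be suboptimal for at least one of the two samples, which is exactly the motivation for per-sample thresholds.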
3. Attention Mechanism
Attention mechanisms are relatively easy to understand in the field of computer vision. Animal visual systems can locate targets by rapidly scanning an entire area, then focus attention on the target object to extract more detail while suppressing irrelevant information. For details, please refer to the literature on attention mechanisms.
The Squeeze-and-Excitation Network (SENet) is a relatively recent deep-learning method that employs attention mechanisms. Across different samples, different feature channels contribute differently to the classification task. SENet uses a small sub-network to learn a set of weights and then multiplies these weights with the features of the corresponding channels, adjusting the magnitude of each channel's features. This process can be regarded as applying a different level of attention to each feature channel.
In this approach, each sample has its own independent set of weights; that is, any two samples may have different weights. In SENet, the exact path for obtaining the weights is "Global Pooling → Fully Connected Layer → ReLU Function → Fully Connected Layer → Sigmoid Function."
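This weight path can be sketched in NumPy as follows (an illustrative version with randomly initialized weights; the layer sizes and the bottleneck ratio `r` are assumptions for the sketch, not values from SENet itself):

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def senet_channel_weights(feature_map, w1, b1, w2, b2):
    """SENet-style weight path:
    Global Pooling -> FC -> ReLU -> FC -> Sigmoid.
    feature_map has shape (C, H, W); returns per-channel weights in (0, 1)."""
    s = feature_map.mean(axis=(1, 2))   # global average pooling -> (C,)
    h = np.maximum(w1 @ s + b1, 0.0)    # FC + ReLU (bottleneck layer)
    return sigmoid(w2 @ h + b2)         # FC + Sigmoid -> (C,)

C, H, W = 8, 4, 4
r = 2                                   # assumed bottleneck reduction ratio
x = rng.standard_normal((C, H, W))
w1, b1 = rng.standard_normal((C // r, C)), np.zeros(C // r)
w2, b2 = rng.standard_normal((C, C // r)), np.zeros(C)

alpha = senet_channel_weights(x, w1, b1, w2, b2)
recalibrated = x * alpha[:, None, None]  # scale each feature channel
```

Because the weights are computed from the sample's own pooled features, two different samples generally receive two different weight sets, as described above.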
4. Soft Thresholding with Deep Attention Mechanism
The Deep Residual Shrinkage Network borrows the structure of the SENet sub-network described above to perform soft thresholding under a deep attention mechanism. Through this sub-network (shown in the red frame), a set of thresholds can be learned and soft thresholding applied to each feature channel.
In this sub-network, the absolute values of all features in the input feature map are computed first. Then, through global average pooling and averaging, a single feature is obtained, denoted A. Along the other path, the feature map after global average pooling is fed into a small fully connected network. This fully connected network uses a Sigmoid function as its final layer to normalize the output to the interval (0, 1), producing a coefficient denoted α. The final threshold can then be written as α × A. Thus, the threshold is the product of a number between 0 and 1 and the average of the absolute values of the feature map. This method ensures that the threshold is positive and not too large.
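The threshold computation α × A can be sketched like this (an illustrative NumPy version with randomly initialized weights; the layer sizes are my assumptions, not values from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def drsn_thresholds(feature_map, w1, b1, w2, b2):
    """Per-channel thresholds tau = alpha * A, where A is the channel-wise
    average of absolute feature values and alpha in (0, 1) comes from a
    small fully connected network ending in a Sigmoid."""
    A = np.abs(feature_map).mean(axis=(1, 2))  # (C,) average |feature| per channel
    h = np.maximum(w1 @ A + b1, 0.0)           # FC + ReLU
    alpha = sigmoid(w2 @ h + b2)               # FC + Sigmoid, strictly in (0, 1)
    return alpha * A                           # positive and bounded by A

C, H, W = 8, 4, 4
x = rng.standard_normal((C, H, W))
w1, b1 = rng.standard_normal((C, C)), np.zeros(C)
w2, b2 = rng.standard_normal((C, C)), np.zeros(C)

tau = drsn_thresholds(x, w1, b1, w2, b2)
# Each tau is positive and cannot exceed the channel's mean absolute value,
# satisfying the threshold rules discussed earlier.
```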
Moreover, different samples yield different thresholds. Therefore, to some extent, this can be viewed as a special attention mechanism: it notices features that are irrelevant to the current task, transforms them into values near zero through the two convolutional layers, and sets them to zero via soft thresholding; conversely, it notices features relevant to the current task, transforms them into values far from zero through the two convolutional layers, and retains them.
Finally, by stacking many basic modules together with convolutional layers, batch normalization, activation functions, global average pooling, and a fully connected output layer, the complete Deep Residual Shrinkage Network is constructed.
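Putting the pieces together, a heavily simplified basic module can be sketched as below. This is my own illustration, not the paper's implementation: the two convolution/BN/ReLU layers of the real module are omitted, so the residual path is taken to be the input itself.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def soft_threshold(x, tau):
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def shrinkage_module(x, w1, b1, w2, b2):
    """Simplified residual shrinkage building block:
    residual path (convolutions omitted) -> learned per-channel thresholds
    -> soft thresholding -> add the identity shortcut."""
    A = np.abs(x).mean(axis=(1, 2))        # (C,) average |feature| per channel
    h = np.maximum(w1 @ A + b1, 0.0)       # FC + ReLU
    alpha = sigmoid(w2 @ h + b2)           # FC + Sigmoid
    tau = (alpha * A)[:, None, None]       # broadcast threshold per channel
    return soft_threshold(x, tau) + x      # shrinkage plus identity shortcut

C, H, W = 8, 4, 4
x = rng.standard_normal((C, H, W))
w1, b1 = rng.standard_normal((C, C)), np.zeros(C)
w2, b2 = rng.standard_normal((C, C)), np.zeros(C)
out = shrinkage_module(x, w1, b1, w2, b2)
# Stacking many such modules, plus conv/BN/activation layers and a final
# global-average-pooling + fully connected head, yields the full network.
```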
5. Generalization Capability
The Deep Residual Shrinkage Network is, in fact, a general feature-learning method. This is because, in many feature-learning tasks, the samples contain some noise or irrelevant information, and that noise and irrelevant information can degrade feature-learning performance. For example:
In image classification, if an image contains many other objects, those objects can be regarded as "noise." The Deep Residual Shrinkage Network may use its attention mechanism to notice this "noise" and then use soft thresholding to set the corresponding features to zero, which may improve image-classification accuracy.
In speech recognition, especially in noisy environments such as conversations by a roadside or inside a factory, the Deep Residual Shrinkage Network may improve speech-recognition accuracy, or at least offer a path toward improving it.
Reference
Minghang Zhao, Shisheng Zhong, Xuyun Fu, Baoping Tang, Michael Pecht, Deep residual shrinkage networks for fault diagnosis, IEEE Transactions on Industrial Informatics, 2020, 16(7): 4681-4690.
https://ieeexplore.ieee.org/document/8850096
BibTeX
@article{Zhao2020,
author = {Minghang Zhao and Shisheng Zhong and Xuyun Fu and Baoping Tang and Michael Pecht},
title = {Deep Residual Shrinkage Networks for Fault Diagnosis},
journal = {IEEE Transactions on Industrial Informatics},
year = {2020},
volume = {16},
number = {7},
pages = {4681-4690},
doi = {10.1109/TII.2019.2943898}
}
Academic Impact
The paper has received more than 1,400 citations on Google Scholar.
According to incomplete statistics, the Deep Residual Shrinkage Network (DRSN) has been used directly, or in adapted form, in more than 1,000 publications and studies across many fields, including mechanical engineering, electrical power, vision, healthcare, speech, text, radar, and remote sensing.