Deep Residual Shrinkage Network: An Artificial Intelligence Method for Highly Noisy Data

The Deep Residual Shrinkage Network is an improved variant of the Deep Residual Network. Fundamentally, it is a combination of the Deep Residual Network, attention mechanisms, and soft thresholding functions.

To some extent, the working mechanism of the Deep Residual Shrinkage Network can be understood as follows: it uses attention mechanisms to identify unimportant features and applies soft thresholding functions to set them to zero; conversely, it identifies important features and retains them. This process strengthens a deep neural network's ability to extract useful features from noisy signals.

1. Research Motivation

First, when classifying samples, the presence of noise, such as Gaussian noise, pink noise, and Laplacian noise, is unavoidable. More broadly, samples often contain information that is irrelevant to the current classification task, and this can also be regarded as noise. Such noise can degrade classification performance. (Soft thresholding, discussed below, is a key step in many signal denoising algorithms.)

For example, during a roadside conversation, speech may be mixed with the sounds of car horns and wheels. When performing speech recognition on such signals, the results will inevitably be affected by these background sounds. From a deep learning perspective, the features corresponding to the horns and wheels should be removed inside the deep neural network to prevent them from affecting the speech recognition results.

Second, even within a single dataset, the amount of noise usually differs from sample to sample. (This is somewhat analogous to attention mechanisms; taking an image dataset as an example, the location of the target object may differ across images, and attention mechanisms can focus on the specific location of the target object in each image.)

For example, when training a cat-and-dog classifier, consider five images labeled "dog". The first image may contain a dog and a mouse, the second a dog and a goose, the third a dog and a chicken, the fourth a dog and a donkey, and the fifth a dog and a duck. During training, the classifier will inevitably be disturbed by irrelevant objects such as mice, geese, chickens, donkeys, and ducks, reducing classification accuracy. If we could identify these irrelevant objects and remove their corresponding features, it might be possible to improve the accuracy of the cat-and-dog classifier.

2. Soft Thresholding

Soft thresholding is a core step in many signal denoising algorithms. It removes features whose absolute values are below a certain threshold and shrinks features whose absolute values exceed this threshold toward zero. It can be implemented with the following formula:

\[y = \begin{cases} x - \tau & x > \tau \\ 0 & -\tau \le x \le \tau \\ x + \tau & x < -\tau \end{cases}\]
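
As a minimal sketch of this formula (assuming NumPy; the function name is ours, not from the paper):

```python
import numpy as np

def soft_threshold(x: np.ndarray, tau: float) -> np.ndarray:
    """Soft thresholding: zero out |x| <= tau, shrink larger values toward zero."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

# Small-magnitude entries become zero; larger ones are shrunk by tau.
x = np.array([-2.0, -0.5, 0.0, 0.3, 1.5])
print(soft_threshold(x, tau=1.0))  # [-1. -0.  0.  0.  0.5]
```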

The derivative of the soft thresholding output with respect to the input is:

\[\frac{\partial y}{\partial x} = \begin{cases} 1 & x > \tau \\ 0 & -\tau \le x \le \tau \\ 1 & x < -\tau \end{cases}\]

As shown above, the derivative of soft thresholding is either 1 or 0. This property is the same as that of the ReLU activation function. Therefore, soft thresholding can also reduce the risk of gradient vanishing and gradient exploding in deep learning algorithms.
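
This behavior of the derivative can be checked directly with automatic differentiation; the following sketch assumes PyTorch:

```python
import torch

def soft_threshold(x: torch.Tensor, tau: float) -> torch.Tensor:
    return torch.sign(x) * torch.clamp(x.abs() - tau, min=0.0)

x = torch.tensor([-2.0, -0.5, 0.3, 1.5], requires_grad=True)
soft_threshold(x, tau=1.0).sum().backward()
print(x.grad)  # tensor([1., 0., 0., 1.]): the gradient is always 1 or 0
```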

In a soft thresholding function, the setting of the threshold must satisfy two conditions: first, the threshold must be a positive number; second, the threshold must not be larger than the maximum value of the input signal; otherwise, the output will be all zeros.

In addition, it is preferable for the threshold to satisfy a third condition: each sample should have its own independent threshold based on its noise content.
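
With \(i\) indexing samples, and taking the maximum over absolute values since soft thresholding acts on magnitudes, these conditions can be summarized as:

\[0 < \tau^{(i)} < \max_j \left| x_j^{(i)} \right|\]

where \(\tau^{(i)}\) is the threshold of sample \(i\) and \(x_j^{(i)}\) are its features.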

This is because the amount of noise often differs across samples. For example, within a single dataset it is common for Sample A to contain little noise while Sample B contains much more. In that case, when soft thresholding is used in a denoising algorithm, Sample A should use a smaller threshold, while Sample B should use a larger one. Although these features and thresholds lose their explicit physical meaning in deep neural networks, the underlying logic remains the same. In other words, each sample should have its own independent threshold determined by its noise content.

3. Attention Mechanism

Attention mechanisms are easiest to understand in the field of computer vision. Animal visual systems can quickly detect targets by scanning an entire scene and then focus attention on the target object, extracting more detail while suppressing irrelevant information. For more details, please refer to the literature on attention mechanisms.

The Squeeze-and-Excitation Network (SENet) represents a relatively new deep learning method that employs attention mechanisms. In different samples, different feature channels usually contribute differently to the classification task. SENet uses a small sub-network to obtain a set of weights ("Learn a set of weights") and then multiplies these weights with the features of the corresponding channels ("Apply weighting to each feature channel") to adjust the magnitude of the features in each channel. This process can be viewed as applying different levels of attention to different feature channels.

[Figure: Squeeze-and-Excitation Network]

In this approach, each sample has its own independent set of weights. In other words, any two samples can have different weights. In SENet, the specific path for obtaining the weights is "Global Pooling → Fully Connected Layer → ReLU Function → Fully Connected Layer → Sigmoid Function."

[Figure: Squeeze-and-Excitation Network]
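
A minimal sketch of this weighting path (assuming PyTorch; the reduction ratio and sizes are illustrative, not taken from the paper):

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Global Pooling -> Fully Connected -> ReLU -> Fully Connected -> Sigmoid."""
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, height, width)
        w = self.fc(x.mean(dim=(2, 3)))   # learn a set of weights in (0, 1)
        return x * w[:, :, None, None]    # apply weighting to each feature channel

x = torch.randn(8, 16, 32, 32)
print(SEBlock(16)(x).shape)  # torch.Size([8, 16, 32, 32])
```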

4. Soft Thresholding with a Deep Attention Mechanism

The Deep Residual Shrinkage Network draws on the SENet sub-network structure described above to implement soft thresholding under a deep attention mechanism. Through this sub-network (shown in the red box), a set of thresholds can be learned ("Learn a set of thresholds") so that soft thresholding can be applied to each feature channel.

[Figure: Deep Residual Shrinkage Network]

In this sub-network, the absolute values of all the features in the input feature map are computed first. Then, through global average pooling and averaging, a feature is obtained, denoted A. In the other branch, the feature map after global average pooling is fed into a small fully connected network. This fully connected network uses a Sigmoid function as its final layer to normalize the output between 0 and 1, yielding a coefficient denoted α. The final threshold can then be written as α × A. Thus, the threshold is the product of a number between 0 and 1 and the average of the absolute values of the feature map. This approach ensures that the threshold is not only positive but also not excessively large.
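
A sketch of this threshold-learning sub-network (again assuming PyTorch; the fully connected layer sizes are illustrative):

```python
import torch
import torch.nn as nn

class Shrinkage(nn.Module):
    """Learn a per-channel threshold tau = alpha * A, then soft-threshold."""
    def __init__(self, channels: int):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels),
            nn.ReLU(inplace=True),
            nn.Linear(channels, channels),
            nn.Sigmoid(),                    # normalizes the output to (0, 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, height, width)
        a = x.abs().mean(dim=(2, 3))         # A: average absolute value per channel
        alpha = self.fc(a)                   # coefficient alpha in (0, 1)
        tau = (alpha * a)[:, :, None, None]  # threshold alpha * A: positive, not too large
        return torch.sign(x) * torch.clamp(x.abs() - tau, min=0.0)  # soft thresholding
```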

In addition, different samples produce different thresholds. Therefore, to some extent, this can be interpreted as a special attention mechanism: it identifies features irrelevant to the current task, transforms them to values close to zero through the two convolutional layers, and sets them to zero using soft thresholding; or it identifies features relevant to the current task, transforms them to values far from zero through the two convolutional layers, and retains them.

Finally, by stacking a number of basic modules ("Stack many basic modules") together with convolutional layers, batch normalization, activation functions, global average pooling, and a fully connected output layer, the complete Deep Residual Shrinkage Network is constructed. Note that the architecture also contains an "Identity path" and "Weighting".

[Figure: Deep Residual Shrinkage Network]
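
Reusing the imports and the Shrinkage module from the previous sketch, a basic module with its identity path, plus a small stacked network, might look as follows (the layer ordering and sizes are illustrative assumptions, not necessarily the paper's exact configuration):

```python
class BasicModule(nn.Module):
    """Two convolutional layers, learned soft thresholding, and an identity path."""
    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.BatchNorm2d(channels), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(channels), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            Shrinkage(channels),             # sub-network from the previous sketch
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.body(x)              # identity path

# Stack many basic modules, then pool and classify (sizes are illustrative).
net = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1),
    BasicModule(16), BasicModule(16),
    nn.BatchNorm2d(16), nn.ReLU(inplace=True),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(16, 10),                       # fully connected output layer
)
print(net(torch.randn(8, 1, 32, 32)).shape)  # torch.Size([8, 10])
```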

5. Generalization Capability

The Deep Residual Shrinkage Network is, in fact, a general feature learning method. This is because, in many feature learning tasks, samples often contain noise and irrelevant information, and this noise and irrelevant information can affect feature learning performance. For example:

In image classification, if an image contains many irrelevant objects, these objects can be understood as "noise." The Deep Residual Shrinkage Network may be able to use its attention mechanism to notice this "noise" and then use soft thresholding to set the features corresponding to it to zero, potentially improving image classification accuracy.

In speech recognition, especially in highly noisy settings such as roadside conversations or a factory workshop, the Deep Residual Shrinkage Network may improve speech recognition accuracy, or at least provide a methodology capable of doing so.

Reference

Minghang Zhao, Shisheng Zhong, Xuyun Fu, Baoping Tang, Michael Pecht, Deep residual shrinkage networks for fault diagnosis, IEEE Transactions on Industrial Informatics, 2020, 16(7): 4681-4690.

https://ieeexplore.ieee.org/document/8850096

BibTeX

@article{Zhao2020,
  author    = {Minghang Zhao and Shisheng Zhong and Xuyun Fu and Baoping Tang and Michael Pecht},
  title     = {Deep Residual Shrinkage Networks for Fault Diagnosis},
  journal   = {IEEE Transactions on Industrial Informatics},
  year      = {2020},
  volume    = {16},
  number    = {7},
  pages     = {4681-4690},
  doi       = {10.1109/TII.2019.2943898}
}

Academic Impact

This paper has received more than 1,400 citations on Google Scholar.

Based on incomplete statistics, the Deep Residual Shrinkage Network (DRSN) has been directly used, or adapted and then applied, in more than 1,000 publications and studies across many fields, including mechanical engineering, electrical power, computer vision, healthcare, speech, text, radar, and remote sensing.