Deep Residual Shrinkage Network: An Artificial Intelligence Method for Highly Noisy Data

The Deep Residual Shrinkage Network is an improved variant of the Deep Residual Network. In essence, it integrates the Deep Residual Network with attention mechanisms and soft thresholding functions.

To some extent, the working principle of the Deep Residual Shrinkage Network can be understood as follows: it uses attention mechanisms to identify unimportant features and sets them to zero via soft thresholding functions; conversely, it identifies important features and retains them. This approach strengthens the ability of deep neural networks to extract useful features from signals contaminated by noise.

1. Research Motivation

First, when classifying samples, the presence of noise, such as Gaussian noise, pink noise, and Laplacian noise, is unavoidable. More broadly, samples often carry information that is irrelevant to the classification task at hand, which can likewise be interpreted as noise. Such noise can degrade classification performance. (Soft thresholding is a key step in many signal denoising algorithms.)

For example, during a roadside conversation, speech may be mixed with the sound of car horns and tires. When speech recognition is performed on such signals, the results are inevitably affected by these background sounds. From a deep-learning perspective, the features corresponding to the horns and tires should be eliminated inside the deep neural network to prevent them from affecting the speech recognition results.

Second, even within the same dataset, the amount of noise usually varies from sample to sample. (This has something in common with attention mechanisms; for example, in an image dataset, the location of the target object can differ across images, and attention mechanisms can focus on the specific location of the target object in each image.)

For example, when training a cat-and-dog classifier, consider five images labeled "dog." The first might contain a dog and a mouse, the second a dog and a goose, the third a dog and a chicken, the fourth a dog and a donkey, and the fifth a dog and a duck. During training, the classifier will inevitably encounter interference from irrelevant objects such as the mouse, goose, chicken, donkey, and duck, reducing classification accuracy. If we could identify these irrelevant objects and eliminate their corresponding features, we might improve the accuracy of the cat-and-dog classifier.

2. Soft Thresholding

Soft thresholding is a core step in many signal denoising algorithms. It eliminates features whose absolute values are below a given threshold and shrinks features whose absolute values exceed that threshold toward zero. It can be implemented with the following formula:

\[y = \begin{cases} x - \tau & x > \tau \\ 0 & -\tau \le x \le \tau \\ x + \tau & x < -\tau \end{cases}\]

The derivative of the soft thresholding output with respect to the input is:

\[\frac{\partial y}{\partial x} = \begin{cases} 1 & x > \tau \\ 0 & -\tau \le x \le \tau \\ 1 & x < -\tau \end{cases}\]

As shown above, the derivative of soft thresholding is either 1 or 0. This property is the same as that of the ReLU activation function. Consequently, soft thresholding can also reduce the risk of vanishing and exploding gradients in deep learning algorithms.
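As an illustration, the piecewise formula above can be written compactly in NumPy; this is a sketch for clarity, not code from the original paper:

```python
import numpy as np

def soft_threshold(x, tau):
    """Soft thresholding: zero out values with |x| <= tau and
    shrink the remaining values toward zero by tau."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

x = np.array([-3.0, -0.5, 0.0, 0.5, 3.0])
y = soft_threshold(x, 1.0)  # values in [-1, 1] become 0; the rest move toward 0 by 1
```

Note that `np.sign(x) * np.maximum(np.abs(x) - tau, 0)` is exactly the three-case formula written as a single expression.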

In the soft thresholding function, the threshold must satisfy two conditions: first, the threshold must be a positive number; second, the threshold must not exceed the maximum value of the input signal, otherwise the output would be entirely zero.

In addition, it is preferable for the threshold to satisfy a third condition: each sample should have its own independent threshold, determined by its noise content.

The reason is that noise content usually varies across samples. For example, it is common within the same dataset for Sample A to contain little noise while Sample B contains a lot. In that case, when soft thresholding is applied in a denoising algorithm, Sample A should use a small threshold while Sample B uses a large one. Although these features and thresholds lose their direct physical definitions inside deep neural networks, the basic underlying logic is the same. In other words, each sample should have its own independent threshold, determined by its specific noise content.

3. Attention Mechanism

Attention mechanisms are easy to understand from the perspective of computer vision. Animal visual systems can locate targets by quickly scanning an entire area, then focusing attention on the target object to gather more detail while suppressing irrelevant information. For more detail, please refer to the literature on attention mechanisms.

The Squeeze-and-Excitation Network (SENet) represents a relatively new deep-learning approach that uses attention mechanisms. Across different samples, the contributions of different feature channels to the classification task often differ. SENet uses a small sub-network to learn a set of weights, then multiplies these weights with the features of the corresponding channels to adjust the magnitude of each channel's features. This process can be viewed as applying a different degree of attention to each feature channel.

Squeeze-and-Excitation Network

In this approach, each sample has its own independent set of weights. In other words, any two different samples can have different weights. In SENet, the specific path for obtaining the weights is "Global Pooling → Fully Connected Layer → ReLU Function → Fully Connected Layer → Sigmoid Function."
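The weight path "Global Pooling → FC → ReLU → FC → Sigmoid" can be sketched in plain NumPy. The weight matrices `w1` and `w2` and the reduction ratio `r` below are illustrative stand-ins, not values from SENet itself:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def se_channel_weights(feature_map, w1, w2):
    """SENet-style weight path: Global Pooling -> FC -> ReLU -> FC -> Sigmoid.
    feature_map has shape (C, H, W); the result is one weight in (0, 1) per channel."""
    squeezed = feature_map.mean(axis=(1, 2))  # global average pooling -> (C,)
    hidden = np.maximum(squeezed @ w1, 0.0)   # fully connected layer + ReLU
    return sigmoid(hidden @ w2)               # fully connected layer + Sigmoid

rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 4, 2                       # r: illustrative channel-reduction ratio
fmap = rng.normal(size=(C, H, W))
w1 = rng.normal(size=(C, C // r))
w2 = rng.normal(size=(C // r, C))
weights = se_channel_weights(fmap, w1, w2)
recalibrated = fmap * weights[:, None, None]  # scale each channel by its own weight
```

Because the weights are computed from the sample's own pooled features, two different samples generally receive two different sets of weights, which is the per-sample behavior described above.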

Squeeze-and-Excitation Network

4. Soft Thresholding with a Deep Attention Mechanism

The Deep Residual Shrinkage Network draws on the SENet sub-network structure described above to implement soft thresholding under a deep attention mechanism. Through the sub-network (shown in the red box), a set of thresholds can be learned and used to apply soft thresholding to each feature channel.

Deep Residual Shrinkage Network

Inside this sub-network, the absolute values of all features in the input feature map are computed first. Then, through global average pooling and averaging, a feature is obtained, denoted A. Along the other path, the feature map after global average pooling is fed into a small fully connected network. This fully connected network uses a Sigmoid function in its final layer to normalize its output to between 0 and 1, yielding a coefficient denoted α. The final threshold can be expressed as α × A. Thus, the threshold is the product of a number between 0 and 1 and the average of the absolute values of the feature map. This method ensures that the threshold is not only positive but also not excessively large.
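A minimal NumPy sketch of this threshold sub-network follows; the weight matrices `w1` and `w2` are randomly initialized stand-ins for illustration, and one threshold is produced per feature channel:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def adaptive_thresholds(feature_map, w1, w2):
    """DRSN threshold sub-network (sketch): tau = alpha * A, where A is the
    per-channel average absolute value of the features and alpha in (0, 1)
    comes from a small fully connected network ending in a Sigmoid."""
    A = np.abs(feature_map).mean(axis=(1, 2))  # average absolute value per channel
    hidden = np.maximum(A @ w1, 0.0)           # FC + ReLU
    alpha = sigmoid(hidden @ w2)               # FC + Sigmoid -> coefficient in (0, 1)
    return alpha * A                           # positive, and never larger than A

rng = np.random.default_rng(1)
C = 8
fmap = rng.normal(size=(C, 4, 4))
tau = adaptive_thresholds(fmap, rng.normal(size=(C, C)), rng.normal(size=(C, C)))
```

Since α lies strictly between 0 and 1, the resulting threshold α × A is guaranteed to be positive yet smaller than the average absolute value of the channel, which is exactly the property argued for above.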

Moreover, different samples produce different thresholds. To some extent, this can therefore be interpreted as a special attention mechanism: it identifies features irrelevant to the task at hand, transforms them into values near zero through the two convolutional layers, and sets them to zero via soft thresholding; conversely, it identifies features relevant to the task, transforms them into values far from zero through the two convolutional layers, and preserves them.

Finally, by stacking many basic modules together with convolutional layers, batch normalization, activation functions, global average pooling, and a fully connected output layer, a complete Deep Residual Shrinkage Network is constructed.
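Putting the pieces together, one basic module and the stacking step can be sketched as follows. Convolutional layers and batch normalization are omitted for brevity, and the random weights are stand-ins rather than a trained network:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def soft_threshold(x, tau):
    """Zero out |x| <= tau; shrink the rest toward zero by tau."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def shrinkage_block(x, w1, w2):
    """One residual shrinkage module (sketch): learn per-channel thresholds
    from x itself, soft-threshold the features, then add the identity
    shortcut. Convolutions and batch normalization are left out."""
    A = np.abs(x).mean(axis=(1, 2))                # per-channel average |x|
    alpha = sigmoid(np.maximum(A @ w1, 0.0) @ w2)  # FC -> ReLU -> FC -> Sigmoid
    tau = (alpha * A)[:, None, None]               # threshold alpha * A per channel
    return x + soft_threshold(x, tau)              # identity (residual) connection

rng = np.random.default_rng(2)
C = 8
x = rng.normal(size=(C, 4, 4))
for _ in range(3):  # stack several basic modules, as in the full network
    x = shrinkage_block(x, rng.normal(size=(C, C)), rng.normal(size=(C, C)))
```

In the full network, the output of the stacked modules would then pass through global average pooling and a fully connected output layer to produce the classification result.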

Deep Residual Shrinkage Network

5. Generalization Capability

The Deep Residual Shrinkage Network is, in fact, a general-purpose feature learning method. This is because, in many feature learning tasks, samples carry more or less noise as well as irrelevant information, both of which can affect feature learning performance. For example:

In image classification, if an image also contains many other objects, these objects can be understood as "noise." The Deep Residual Shrinkage Network may be able to use its attention mechanism to notice this "noise" and then use soft thresholding to set the features corresponding to it to zero, potentially improving image classification accuracy.

In speech recognition, especially in noisy environments such as roadside conversations or factory floors, the Deep Residual Shrinkage Network may improve speech recognition accuracy, or at least it provides a methodology with the potential to do so.

Reference

Minghang Zhao, Shisheng Zhong, Xuyun Fu, Baoping Tang, Michael Pecht, Deep residual shrinkage networks for fault diagnosis, IEEE Transactions on Industrial Informatics, 2020, 16(7): 4681-4690.

https://ieeexplore.ieee.org/document/8850096

BibTeX

@article{Zhao2020,
  author    = {Minghang Zhao and Shisheng Zhong and Xuyun Fu and Baoping Tang and Michael Pecht},
  title     = {Deep Residual Shrinkage Networks for Fault Diagnosis},
  journal   = {IEEE Transactions on Industrial Informatics},
  year      = {2020},
  volume    = {16},
  number    = {7},
  pages     = {4681-4690},
  doi       = {10.1109/TII.2019.2943898}
}

Academic Impact

The paper has received more than 1,400 citations on Google Scholar.

According to an incomplete count, the Deep Residual Shrinkage Network (DRSN) has been applied, directly or in modified form, in more than 1,000 publications across a wide range of fields, including mechanical engineering, electrical power, vision, healthcare, speech, text, radar, and remote sensing.