Deep Residual Shrinkage Network: An Artificial Intelligence Method for Highly Noisy Data

The Deep Residual Shrinkage Network is an improved variant of the Deep Residual Network. In essence, it is an integration of the Deep Residual Network, attention mechanisms, and soft thresholding functions.

In simpler terms, the working principle of the Deep Residual Shrinkage Network can be understood as follows: it uses attention mechanisms to identify unimportant features and applies soft thresholding functions to set their values to zero; conversely, it identifies important features and preserves them. This process strengthens a deep neural network's ability to extract useful features from noisy signals.

1. Research Motivation

First, when we classify samples, noise, such as Gaussian noise, pink noise, and Laplacian noise, is inevitably present. More broadly, samples often contain information that is irrelevant to the current classification task, which can also be regarded as noise. This noise can hurt classification performance. (Soft thresholding is a key step in many signal denoising algorithms.)

For example, during a conversation by the roadside, the recorded audio may pick up car horns and wheel noise. When speech recognition is applied to these signals, the result will inevitably be affected by those background sounds. From a deep learning perspective, the features corresponding to horns and wheels should be eliminated inside the deep neural network so that they do not disturb the speech recognition result.

Second, even within a single dataset, the noise content of individual samples often differs. (This is related to attention mechanisms: taking an image dataset as an example, the location of the target object may vary across images, and an attention mechanism can focus on the target object's specific location in each image.)

Suppose, for instance, that we are training a cat-and-dog classifier. Take five photos labeled "dog". The first photo contains a dog and a mouse, the second a dog and a goose, the third a dog and a chicken, the fourth a dog and a donkey, and the fifth a dog and a duck. During training, the classifier may be confused by these irrelevant objects, the mouse, goose, chicken, donkey, and duck, which can lower classification accuracy. If we could identify these irrelevant objects and eliminate their corresponding features, the accuracy of the cat-and-dog classifier might improve.

2. Soft Thresholding

Soft thresholding is the core step of many signal denoising algorithms. It eliminates features whose absolute values are below a certain threshold and shrinks the features whose absolute values exceed the threshold toward zero. It can be implemented with the following formula:

\[y = \begin{cases} x - \tau & x > \tau \\ 0 & -\tau \le x \le \tau \\ x + \tau & x < -\tau \end{cases}\]

Here, \(x\) is the input feature, \(y\) is the output feature, and \(\tau\) is the threshold. The derivative of the soft thresholding output with respect to its input is:

\[\frac{\partial y}{\partial x} = \begin{cases} 1 & x > \tau \\ 0 & -\tau \le x \le \tau \\ 1 & x < -\tau \end{cases}\]

As shown above, the derivative of soft thresholding is either 1 or 0. This property is identical to that of the ReLU activation function. Therefore, using soft thresholding also reduces the risk of vanishing and exploding gradients in deep learning algorithms.
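As a minimal sketch, the piecewise function above can be written in a couple of lines of PyTorch (the function name `soft_threshold` is an illustrative choice, not from the paper):

```python
import torch

def soft_threshold(x: torch.Tensor, tau: torch.Tensor) -> torch.Tensor:
    # Zero out values with |x| <= tau and shrink the rest toward zero by tau.
    return torch.sign(x) * torch.clamp(torch.abs(x) - tau, min=0.0)

# With tau = 1.0: values in [-1, 1] become 0, all others move 1.0 closer to 0.
x = torch.tensor([-2.0, -0.5, 0.0, 0.8, 3.0])
print(soft_threshold(x, torch.tensor(1.0)))  # tensor([-1., 0., 0., 0., 2.])
```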

When setting the threshold in the soft thresholding function, two conditions must be met: first, the threshold must be a positive number; second, the threshold must not be larger than the maximum absolute value of the input signal, otherwise the output would be all zeros.

In addition, the threshold should satisfy a third condition: each sample should have its own independent threshold, determined by its noise content.

This is because the noise content often differs across samples. For example, it frequently happens that, within the same dataset, Sample A contains little noise while Sample B contains a lot. In that case, when soft thresholding is applied in a denoising algorithm, Sample A should use a small threshold and Sample B a large one. Although features and thresholds no longer have clear physical definitions inside a deep neural network, the underlying logic is the same: each sample should have its own independent threshold, determined by its specific noise content.

3. Attention Mechanism

Attention mechanisms are easy to understand in the field of computer vision. The visual system of animals can rapidly scan an entire area and identify the target object, then focus attention on that object to extract more detail while ignoring irrelevant information. For more details, you can refer to the literature on attention mechanisms.

The Squeeze-and-Excitation Network (SENet) is a relatively recent deep learning method that employs attention mechanisms. Across different samples, different feature channels often contribute differently to the classification task. SENet uses a small sub-network to learn a set of weights and then multiplies these weights with the features of their corresponding channels. This process can be viewed as applying a different level of weighting to each feature channel.

[Figure: Squeeze-and-Excitation Network]

With this approach, each sample gets its own independent set of weights; in other words, any two samples can have different weights. In SENet, the specific path for obtaining the weights is: "Global Pooling → Fully Connected Layer → ReLU Function → Fully Connected Layer → Sigmoid Function."
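A minimal PyTorch sketch of this path is shown below, assuming 2-D feature maps; the class name `SEBlock` and the reduction ratio of 16 are illustrative assumptions, not taken from the text:

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    # Learns one weight in (0, 1) per channel, per sample, then rescales
    # each feature channel by its weight.
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = x.mean(dim=(2, 3))           # global average pooling -> (b, c)
        w = self.fc(w).view(b, c, 1, 1)  # per-channel weights in (0, 1)
        return x * w                     # weight each feature channel
```

Because the weights are computed from each sample's own pooled features, any two samples end up with different channel weights, which is the per-sample behavior described above.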

[Figure: Squeeze-and-Excitation Network]

4. Soft Thresholding under a Deep Attention Mechanism

The Deep Residual Shrinkage Network borrows the SENet sub-network structure described above to implement soft thresholding under a deep attention mechanism. Through a sub-network (shown in the red box in the figure below), we can learn a set of thresholds and apply soft thresholding to each feature channel.

[Figure: Deep Residual Shrinkage Network]

In this sub-network, the absolute values of all features in the input feature map are computed first. Then, through global average pooling and averaging, a single feature is obtained, which we denote A. In the other path, the feature map produced by global average pooling is fed into a small fully connected network. This fully connected network ends with a Sigmoid function that normalizes the output to between 0 and 1, giving a coefficient α. The final threshold can then be written as α × A. The threshold is therefore the product of a number between 0 and 1 and the average of the absolute values of the feature map. This method ensures that the threshold is not only positive but also not too large.

Moreover, different samples yield different thresholds. As a result, this can be regarded as a special kind of attention mechanism: it identifies features that are irrelevant to the current task, converts them via the two convolutional layers into values close to zero, and sets them to zero with soft thresholding; or it identifies features that are important to the current task, converts them into values far from zero, and preserves them.
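The following PyTorch sketch combines the two paths. It follows a channel-wise reading in which a separate average A is kept per channel; the class name `Shrinkage` and the fully connected layer sizes are illustrative assumptions:

```python
import torch
import torch.nn as nn

class Shrinkage(nn.Module):
    # Learns a per-sample, per-channel threshold tau = alpha * A and
    # applies soft thresholding to the input feature map.
    def __init__(self, channels: int):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels),
            nn.ReLU(inplace=True),
            nn.Linear(channels, channels),
            nn.Sigmoid(),  # normalizes the coefficient alpha to (0, 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        abs_x = x.abs()
        a = abs_x.mean(dim=(2, 3))                     # A: channel-wise average of |x|
        alpha = self.fc(a)                             # alpha in (0, 1)
        tau = (alpha * a).unsqueeze(-1).unsqueeze(-1)  # threshold = alpha * A
        # Soft thresholding: zero out |x| <= tau, shrink the rest toward zero.
        return torch.sign(x) * torch.clamp(abs_x - tau, min=0.0)
```

Since `tau` is the product of a sigmoid output and the mean absolute value, it is guaranteed to be positive and no larger than the average magnitude of the features, matching the conditions discussed earlier.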

Finally, by stacking many of these basic modules, together with convolutional layers, batch normalization, activation functions, global average pooling, and a fully connected output layer, the complete Deep Residual Shrinkage Network is constructed. Identity paths are also used to pass features forward.
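As a rough sketch of one such basic module with its identity path, reusing the `Shrinkage` class from the previous snippet (the BN → ReLU → Conv ordering and 3×3 kernels are common residual design choices assumed here, not prescribed by the text):

```python
import torch
import torch.nn as nn

class BasicShrinkageModule(nn.Module):
    # One basic module: two convolutional layers followed by learned soft
    # thresholding, plus an identity shortcut that carries features forward.
    # Assumes the Shrinkage class defined in the previous sketch.
    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.BatchNorm2d(channels), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(channels), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
        )
        self.shrink = Shrinkage(channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # The identity path adds the input back around the shrinkage branch.
        return x + self.shrink(self.body(x))
```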

[Figure: Deep Residual Shrinkage Network]

5. Generalization Capability

The Deep Residual Shrinkage Network is, in fact, a general feature learning method. In many feature learning tasks, samples contain some amount of noise or irrelevant information, and this noise and irrelevant information can affect feature learning performance. For example:

In image classification, if an image contains many other objects, these objects can be treated as "noise". A Deep Residual Shrinkage Network may be able to use its attention mechanism to notice this "noise" and then apply soft thresholding to zero out the corresponding features, which could improve image classification accuracy.

In speech recognition, especially in noisy environments such as roadside conversations or factory workshops, a Deep Residual Shrinkage Network may improve recognition accuracy, or at the very least it offers a methodology that could do so.

Reference

Minghang Zhao, Shisheng Zhong, Xuyun Fu, Baoping Tang, Michael Pecht, Deep residual shrinkage networks for fault diagnosis, IEEE Transactions on Industrial Informatics, 2020, 16(7): 4681-4690.

https://ieeexplore.ieee.org/document/8850096

BibTeX

@article{Zhao2020,
  author    = {Minghang Zhao and Shisheng Zhong and Xuyun Fu and Baoping Tang and Michael Pecht},
  title     = {Deep Residual Shrinkage Networks for Fault Diagnosis},
  journal   = {IEEE Transactions on Industrial Informatics},
  year      = {2020},
  volume    = {16},
  number    = {7},
  pages     = {4681-4690},
  doi       = {10.1109/TII.2019.2943898}
}

Academic Impact

This paper has received more than 1,400 citations on Google Scholar.

According to incomplete statistics, the Deep Residual Shrinkage Network (DRSN) has been applied directly, or applied with modifications, in more than 1,000 publications and studies across many fields, including mechanical engineering, electric power, computer vision, healthcare, speech, text, radar, and remote sensing.