Deep Residual Shrinkage Network: An Artificial Intelligence Method for Highly Noisy Data

The Deep Residual Shrinkage Network is an improved variant of the Deep Residual Network. At its core, the Deep Residual Shrinkage Network integrates the Deep Residual Network with attention mechanisms and soft thresholding functions.

We can understand the working principle of the Deep Residual Shrinkage Network as follows. First, the network uses attention mechanisms to identify unimportant features. Then it uses soft thresholding functions to set these features to zero. Conversely, the network identifies important features and retains them. Through this process, the deep neural network becomes better at extracting useful features from signals that are full of noise.

1. Research Motivation

First, noise is unavoidable when an algorithm classifies samples. Examples of such noise include Gaussian noise, pink noise, and Laplacian noise. In a broader sense, samples often contain information that is irrelevant to the current classification task, and this irrelevant information can also be regarded as noise. Such noise may degrade classification performance.

For example, consider a conversation by the roadside. The recorded audio may contain the sounds of car horns and wheels. Suppose we want to perform speech recognition on these signals. The background sounds will interfere with the results. From a deep learning perspective, the deep neural network should eliminate the features related to the horns and wheels, so that they do not harm the speech recognition results.

Second, the amount of noise varies from sample to sample, even within the same dataset. (This variation is analogous to what attention mechanisms handle. Take an image dataset as an example: the location of the target object may differ from image to image, and attention mechanisms can focus on the specific location of the target object in each image.)

For example, consider training a cat-and-dog classifier with five images labeled "dog". Image 1 contains a dog and a mouse, Image 2 a dog and a goose, Image 3 a dog and a chicken, Image 4 a dog and a donkey, and Image 5 a dog and a duck. During training, the irrelevant objects (the mouse, goose, chicken, donkey, and duck) will interfere with the classifier and reduce its classification accuracy. If we can identify these irrelevant objects and set the features related to them to zero, the accuracy of the cat-and-dog classifier may improve.

2. Soft Thresholding

Soft thresholding is a key step in many signal denoising algorithms. It sets a feature to zero when its absolute value is smaller than a certain threshold, and shrinks a feature toward zero when its absolute value is larger than the threshold. Soft thresholding can be implemented with the following formula:

\[y = \begin{cases} x - \tau & x > \tau \\ 0 & -\tau \le x \le \tau \\ x + \tau & x < -\tau \end{cases}\]
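As a concrete illustration, here is a minimal sketch of this formula in PyTorch (the choice of PyTorch is ours for illustration; it is not the implementation from the paper):

```python
import torch

def soft_threshold(x: torch.Tensor, tau: float) -> torch.Tensor:
    """Soft thresholding: zero out values with |x| <= tau,
    and shrink the remaining values toward zero by tau."""
    return torch.sign(x) * torch.clamp(x.abs() - tau, min=0.0)

x = torch.tensor([-2.0, -0.5, 0.0, 0.3, 1.5])
print(soft_threshold(x, tau=1.0))  # tensor([-1.0, 0.0, 0.0, 0.0, 0.5])
```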

The derivative of the soft thresholding output with respect to the input is:

\[\frac{\partial y}{\partial x} = \begin{cases} 1 & x > \tau \\ 0 & -\tau \le x \le \tau \\ 1 & x < -\tau \end{cases}\]

As the formula above shows, the derivative of soft thresholding is either 1 or 0. This property is the same as that of the ReLU activation function. Therefore, soft thresholding can also reduce the risk of gradient vanishing and gradient exploding in deep learning algorithms.
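This property can be checked numerically with automatic differentiation; a small sketch, reusing the soft_threshold function defined above:

```python
import torch

def soft_threshold(x, tau):
    return torch.sign(x) * torch.clamp(x.abs() - tau, min=0.0)

# Pick inputs on both sides of the threshold and inside the dead zone.
x = torch.tensor([-2.0, 0.3, 1.5], requires_grad=True)
soft_threshold(x, tau=1.0).sum().backward()
print(x.grad)  # tensor([1., 0., 1.]): 1 outside the threshold interval, 0 inside
```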

In the soft thresholding function, the setting of the threshold must satisfy two conditions. First, the threshold must be a positive number. Second, the threshold must not be larger than the maximum absolute value of the input signal; otherwise, the output would be all zeros.

In addition, it is better if the threshold satisfies a third condition: each sample should have its own threshold, determined by the amount of noise in that sample.

The reason is that the noise content often differs between samples. For example, within the same dataset Sample A may contain little noise while Sample B contains a lot of noise. In that case, Sample A should use a smaller threshold during soft thresholding, and Sample B a larger one. Although these features and thresholds lose their explicit physical meaning inside deep neural networks, the basic logic still holds: each sample should have its own threshold, determined by its noise content.

3. Attention Mechanism

Attention mechanisms are easy to understand in the context of computer vision. Animal visual systems can locate targets by quickly scanning the whole scene, and then focus attention on the target object to extract finer detail while suppressing irrelevant information. For details, please refer to the literature on attention mechanisms.

The Squeeze-and-Excitation Network (SENet) is a relatively recent deep learning method that employs attention mechanisms. In different samples, the feature channels contribute differently to the classification task. SENet uses a small sub-network to learn a set of weights, and then multiplies these weights with the features of the corresponding channels, adjusting the magnitude of the features in each channel. This process can be viewed as applying a weight to each feature channel.

[Figure: Squeeze-and-Excitation Network]

In this design, each sample has its own distinct set of weights; that is, any two samples can have different weights. In SENet, the path for obtaining the weights is "Global Pooling → Fully Connected Layer → ReLU Function → Fully Connected Layer → Sigmoid Function."

[Figure: Squeeze-and-Excitation Network]
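A minimal PyTorch sketch of this weighting path may help (the bottleneck ratio `reduction` is an assumption borrowed from the SENet paper, not something specified above):

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Global Pooling -> FC -> ReLU -> FC -> Sigmoid, then channel-wise scaling."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, height, width)
        w = self.fc(x.mean(dim=(2, 3)))   # per-sample channel weights in (0, 1)
        return x * w[:, :, None, None]    # reweight each feature channel
```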

4. Soft Thresholding with Deep Attention Mechanism

The Deep Residual Shrinkage Network borrows the structure of the SENet sub-network and uses it to implement soft thresholding under a deep attention mechanism. The sub-network (marked in the red box) learns a set of thresholds, and the network then applies soft thresholding to each feature channel using these thresholds.

[Figure: Deep Residual Shrinkage Network]

Inside this sub-network, the absolute values of all the features in the input feature map are computed first. Then, through global average pooling and averaging, a feature is obtained, denoted A. On the other path, the feature map after global average pooling is fed into a small fully connected network. This fully connected network ends with a Sigmoid function, which scales the output to the range (0, 1), yielding a coefficient denoted α. The final threshold is expressed as α × A. The threshold is therefore the product of two numbers: one between 0 and 1, and the other the average of the absolute values of the feature map. This design ensures that the threshold is both positive and not too large.
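The PyTorch sketch below shows one plausible reading of this sub-network, computing one threshold per channel (a channel-wise variant; the depth and widths of the fully connected layers are our assumptions):

```python
import torch
import torch.nn as nn

class Shrinkage(nn.Module):
    """Threshold sub-network sketch: the threshold is alpha * A, where A is the
    average of the absolute feature values and alpha in (0, 1) comes from a
    small fully connected network ending in a Sigmoid."""
    def __init__(self, channels: int):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels),
            nn.ReLU(inplace=True),
            nn.Linear(channels, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, height, width)
        a = x.abs().mean(dim=(2, 3))          # A: (batch, channels)
        alpha = self.fc(a)                    # alpha in (0, 1)
        tau = (alpha * a)[:, :, None, None]   # per-sample, per-channel threshold
        # Soft thresholding with the learned threshold.
        return torch.sign(x) * torch.clamp(x.abs() - tau, min=0.0)
```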

Moreover, different samples produce different thresholds. This method can therefore be viewed as a special attention mechanism: it notices the features that are irrelevant to the current task, pushes them close to zero through the two convolutional layers, and then sets them to zero with soft thresholding. Conversely, it notices the features that are relevant to the current task, pushes them far from zero through the two convolutional layers, and retains them.

Finally, by stacking many of these basic modules together with convolutional layers, batch normalization, activation functions, global average pooling, and a fully connected output layer, we obtain the complete Deep Residual Shrinkage Network, as sketched after the figure below.

[Figure: Deep Residual Shrinkage Network]
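As an illustration of this assembly, here is a sketch of a residual shrinkage building block and a small stacked network, reusing the Shrinkage module from the sketch above (layer counts and widths are illustrative assumptions, not the paper's exact configuration):

```python
import torch.nn as nn

class RSBU(nn.Module):
    """Residual shrinkage building unit sketch: two conv layers, the Shrinkage
    sub-network from the previous sketch, and an identity shortcut."""
    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.BatchNorm2d(channels), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(channels), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            Shrinkage(channels),
        )

    def forward(self, x):
        return x + self.body(x)   # identity shortcut

def build_drsn(in_channels: int = 1, channels: int = 16,
               n_blocks: int = 4, n_classes: int = 10) -> nn.Module:
    """Stack basic modules, then global average pooling and an FC output layer."""
    return nn.Sequential(
        nn.Conv2d(in_channels, channels, kernel_size=3, padding=1),
        *[RSBU(channels) for _ in range(n_blocks)],
        nn.BatchNorm2d(channels), nn.ReLU(inplace=True),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),   # global average pooling
        nn.Linear(channels, n_classes),          # fully connected output layer
    )
```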

5. Generalization Capability

The Deep Residual Shrinkage Network is in fact a general feature learning method. The reason is that in many feature learning tasks, the samples frequently contain noise as well as irrelevant information, and this noise and irrelevant information can harm feature learning performance. For example:

Consider image classification. An image may contain many other objects, which can be regarded as "noise." The Deep Residual Shrinkage Network may be able to use its attention mechanism to notice this "noise" and then use soft thresholding to set the features related to it to zero, thereby possibly improving image classification accuracy.

Consider speech recognition, especially in noisy environments such as a roadside conversation or a factory floor. The Deep Residual Shrinkage Network may improve speech recognition accuracy, or at the very least it offers a methodology with the potential to do so.

Reference

Minghang Zhao, Shisheng Zhong, Xuyun Fu, Baoping Tang, Michael Pecht, Deep residual shrinkage networks for fault diagnosis, IEEE Transactions on Industrial Informatics, 2020, 16(7): 4681-4690.

https://ieeexplore.ieee.org/document/8850096

BibTeX

@article{Zhao2020,
  author    = {Minghang Zhao and Shisheng Zhong and Xuyun Fu and Baoping Tang and Michael Pecht},
  title     = {Deep Residual Shrinkage Networks for Fault Diagnosis},
  journal   = {IEEE Transactions on Industrial Informatics},
  year      = {2020},
  volume    = {16},
  number    = {7},
  pages     = {4681-4690},
  doi       = {10.1109/TII.2019.2943898}
}

Academic Impact

This paper has received more than 1,400 citations on Google Scholar.

According to incomplete statistics, researchers have applied the Deep Residual Shrinkage Network (DRSN) in more than 1,000 papers and studies. These applications span a wide range of fields, including mechanical engineering, electrical power, vision, healthcare, speech, text, radar, and remote sensing.