Deep Residual Shrinkage Network: An Artificial Intelligence Method for Data with High Noise

The Deep Residual Shrinkage Network is an improved variant of the Deep Residual Network. In essence, it integrates the Deep Residual Network, attention mechanisms, and soft thresholding functions.

It can be understood as follows: the Deep Residual Shrinkage Network uses attention mechanisms to notice unimportant features and then uses soft thresholding functions to set them to zero, while important features are retained. This process helps the deep neural network extract useful features from signals that contain noise.

1. Research Motivation

First, when classifying samples, noise — such as Gaussian noise, pink noise, or Laplacian noise — is often present. More broadly, samples often contain information that is irrelevant to the current classification task, which can also be regarded as noise. Such noise can degrade classification performance. (Soft thresholding is a key step in many signal denoising algorithms.)

For example, when people converse at the roadside, the speech may be mixed with the sounds of car engines and wheels. When performing speech recognition on such signals, the results are inevitably affected by these background sounds. From a deep learning perspective, the features corresponding to the car sounds should be removed inside the deep neural network so that they do not harm the speech recognition results.

Second, even within the same dataset, the amount of noise often varies from sample to sample. (This is analogous to attention mechanisms: taking an image dataset as an example, the location of the target object may vary across images, and attention mechanisms can focus attention on the target object's location in each individual image.)

For example, when training a cat-and-dog classifier, consider five images labeled "dog". The first image may contain a dog and a rat, the second a dog and a goose, the third a dog and a chicken, the fourth a dog and a donkey, and the fifth a dog and a duck. During training, the classifier may be disturbed by these irrelevant objects — the rat, goose, chicken, donkey, and duck — which reduces classification accuracy. If we can identify these irrelevant objects and delete their features, we may be able to improve the accuracy of the cat-and-dog classifier.

2. Soft Thresholding

Soft thresholding is a key step in many signal denoising algorithms. It deletes features whose absolute values are smaller than a threshold and shrinks features whose absolute values are larger than the threshold toward zero. It can be implemented with the following formula:

\[y = \begin{cases} x - \tau & x > \tau \\ 0 & -\tau \le x \le \tau \\ x + \tau & x < -\tau \end{cases}\]
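As a concrete illustration, the piecewise formula above can be written in a few lines of NumPy (a minimal sketch; the function name `soft_threshold` is our own):

```python
import numpy as np

def soft_threshold(x, tau):
    """Soft thresholding: zeroes values in [-tau, tau] and shrinks the
    rest toward zero by tau, matching the piecewise formula above."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

# With tau = 1: values inside [-1, 1] become 0; values outside are shrunk by 1.
x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
y = soft_threshold(x, 1.0)
```

The `sign(x) * max(|x| - tau, 0)` form is a standard compact way to express all three branches of the piecewise definition at once.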

The derivative of the soft thresholding output with respect to the input is:

\[\frac{\partial y}{\partial x} = \begin{cases} 1 & x > \tau \\ 0 & -\tau \le x \le \tau \\ 1 & x < -\tau \end{cases}\]

As can be seen above, the derivative of soft thresholding is either 1 or 0. This property is the same as that of the ReLU activation function. Therefore, soft thresholding can also help deep learning algorithms reduce the risk of gradient vanishing and gradient exploding.
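The 0-or-1 derivative can be verified with a quick finite-difference check (a small sketch; `soft_threshold_grad` is our own helper, evaluated away from the non-differentiable points at ±τ):

```python
import numpy as np

def soft_threshold(x, tau):
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def soft_threshold_grad(x, tau):
    """Analytic derivative: 1 outside [-tau, tau], 0 inside (like ReLU's 0/1 gradient)."""
    return (np.abs(x) > tau).astype(float)

# Compare against central finite differences away from the kinks at +/- tau.
x = np.array([-2.0, -0.3, 0.3, 2.0])
eps = 1e-6
num = (soft_threshold(x + eps, 1.0) - soft_threshold(x - eps, 1.0)) / (2 * eps)
```

Because the gradient is exactly 1 on the pass-through regions, backpropagated gradients are neither attenuated nor amplified there, which is the property the text relates to ReLU.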

In the soft thresholding function, the threshold must satisfy two conditions: first, the threshold must be positive; second, the threshold must not exceed the maximum absolute value of the input signal, otherwise the output would be all zeros.

Ideally, the threshold should also satisfy a third condition: each sample should have its own independent threshold, determined by its own noise content.

Why? Because noise content often varies across samples. For example, it often happens that, within the same dataset, Sample A contains little noise while Sample B contains a lot. In that case, a soft-thresholding-based denoising algorithm should use a smaller threshold for Sample A and a larger threshold for Sample B. Although features and thresholds no longer have explicit physical definitions inside deep neural networks, the logic is the same: each sample should have its own independent threshold, determined by its own noise content.

3. Attention Mechanism

Attention mechanisms are easy to understand in the field of computer vision. For example, we can quickly scan an entire scene with our eyes, then focus attention on the target object to capture more of its details while suppressing irrelevant information. For further background, readers may consult the literature on attention mechanisms.

The Squeeze-and-Excitation Network (SENet) is a relatively recent deep learning method that employs attention mechanisms. In different samples, different feature channels contribute differently to the classification task. SENet uses a small sub-network to "Learn a set of weights" and then "Apply weighting to each feature channel" to adjust the magnitude of each feature. This process can be viewed as assigning a different amount of attention to each feature channel.

[Figure: Squeeze-and-Excitation Network]

In this way, each sample gets its own set of weights; any two samples may have different weights. In SENet, the path for obtaining the weights is "Global Pooling → Fully Connected Layer → ReLU Function → Fully Connected Layer → Sigmoid Function".
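The weighting path listed above can be sketched in NumPy as follows (an illustrative toy, not the original implementation: the channel count `C`, reduction ratio `r`, and random stand-in weights are all assumptions in place of learned parameters):

```python
import numpy as np

rng = np.random.default_rng(0)

def se_weights(feature_map, w1, b1, w2, b2):
    """SENet weighting path: Global Pooling -> FC -> ReLU -> FC -> Sigmoid.
    feature_map: (channels, height, width); returns one weight in (0, 1) per channel."""
    z = feature_map.mean(axis=(1, 2))          # global average pooling -> (C,)
    h = np.maximum(z @ w1 + b1, 0.0)           # fully connected layer + ReLU (bottleneck)
    s = 1.0 / (1.0 + np.exp(-(h @ w2 + b2)))   # fully connected layer + Sigmoid
    return s

C, r = 8, 2                                    # channels and reduction ratio (assumed)
w1 = rng.standard_normal((C, C // r)); b1 = np.zeros(C // r)
w2 = rng.standard_normal((C // r, C)); b2 = np.zeros(C)

x = rng.standard_normal((C, 4, 4))
w = se_weights(x, w1, b1, w2, b2)
recalibrated = x * w[:, None, None]            # apply weighting to each feature channel
```

The Sigmoid at the end keeps every weight strictly between 0 and 1, so each channel is scaled, never inverted or amplified.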

[Figure: Squeeze-and-Excitation Network]

4. Soft Thresholding with Deep Attention Mechanism

The Deep Residual Shrinkage Network borrows the SENet sub-network structure to realize soft thresholding under a deep attention mechanism. It uses a sub-network (shown in the red box) to "Learn a set of thresholds" and then applies soft thresholding to each feature channel.

[Figure: Deep Residual Shrinkage Network]

In this sub-network, the absolute values of the entire input feature map are computed first. Then, via global average pooling, a feature is obtained, denoted A. In the other path, the feature obtained after global average pooling is fed into a fully connected network. This fully connected network ends with a Sigmoid function, which normalizes the output to between 0 and 1, yielding a coefficient denoted α. The final threshold is α × A. In other words, the threshold is a number between 0 and 1 multiplied by the average of the absolute values of the feature map. This design keeps the threshold positive and prevents it from becoming too large.
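Under the same assumptions as before (random stand-in weights, our own function names), the threshold sub-network described above might look like:

```python
import numpy as np

rng = np.random.default_rng(0)

def shrinkage_thresholds(feature_map, w1, b1, w2, b2):
    """Per-channel thresholds tau = alpha * A, where A is the channel-wise
    average of absolute values and alpha in (0, 1) comes from FC -> Sigmoid."""
    A = np.abs(feature_map).mean(axis=(1, 2))      # absolute values -> global average pooling
    h = np.maximum(A @ w1 + b1, 0.0)               # fully connected layer + ReLU
    alpha = 1.0 / (1.0 + np.exp(-(h @ w2 + b2)))   # fully connected layer + Sigmoid
    return alpha * A                                # positive and bounded threshold

def soft_threshold(x, tau):
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

C = 8                                              # channel count (assumed)
w1 = rng.standard_normal((C, C)); b1 = np.zeros(C)
w2 = rng.standard_normal((C, C)); b2 = np.zeros(C)

x = rng.standard_normal((C, 4, 4))
tau = shrinkage_thresholds(x, w1, b1, w2, b2)
y = soft_threshold(x, tau[:, None, None])          # channel-wise soft thresholding
```

Because α < 1 and A is the mean of the absolute values, the threshold α × A is always positive yet smaller than the maximum absolute value of the channel, satisfying the conditions listed in Section 2.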

Moreover, different samples get different thresholds. The whole construction can therefore be understood as a specialized attention mechanism: it notices features that are irrelevant to the current task, transforms them into values close to zero through the two convolutional layers, and sets them to zero via "Soft thresholding"; features relevant to the current task are retained.

Finally, we "Stack many basic modules" together with convolutional layers, batch normalization, activation functions, global average pooling, and a fully connected output layer to complete the Deep Residual Shrinkage Network.
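Putting the pieces together, one basic module with its identity shortcut can be sketched as below (a toy under heavy assumptions: batch normalization and activations are omitted, and a random 1×1 channel mixing stands in for the real convolutions):

```python
import numpy as np

rng = np.random.default_rng(0)

def soft_threshold(x, tau):
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def residual_shrinkage_block(x, conv, w1, b1, w2, b2):
    """One basic module: channel mixing (conv stand-in) -> threshold
    sub-network -> soft thresholding -> identity shortcut."""
    f = np.einsum('cij,cd->dij', x, conv)           # 1x1 "convolution" stand-in
    A = np.abs(f).mean(axis=(1, 2))                 # abs -> global average pooling
    h = np.maximum(A @ w1 + b1, 0.0)                # FC + ReLU
    alpha = 1.0 / (1.0 + np.exp(-(h @ w2 + b2)))    # FC + Sigmoid
    f = soft_threshold(f, (alpha * A)[:, None, None])
    return x + f                                    # residual (identity) connection

C = 8                                               # channel count (assumed)
conv = rng.standard_normal((C, C)) * 0.1
w1 = rng.standard_normal((C, C)); b1 = np.zeros(C)
w2 = rng.standard_normal((C, C)); b2 = np.zeros(C)

x = rng.standard_normal((C, 4, 4))
y = x
for _ in range(3):                                  # stack several basic modules
    y = residual_shrinkage_block(y, conv, w1, b1, w2, b2)
```

The identity shortcut is what makes this a *residual* shrinkage block: the sub-network only has to learn how much to shrink the residual branch, not the whole mapping.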

[Figure: Deep Residual Shrinkage Network]

5. Generalization Capability

The Deep Residual Shrinkage Network is in fact a general feature learning method. This is because, in many feature learning tasks, samples often contain noise and irrelevant information, and this noise and irrelevant information can degrade feature learning. For example:

In image classification, if an image contains many other objects, those objects can be regarded as "noise". The Deep Residual Shrinkage Network may use its attention mechanism to notice this "noise" and then use soft thresholding to set the features corresponding to it to zero, which may improve image classification accuracy.

In speech recognition, in especially noisy environments — such as conversations at the roadside or inside a factory — the Deep Residual Shrinkage Network may improve speech recognition accuracy, or at least it offers an approach that could improve it.

Reference

Minghang Zhao, Shisheng Zhong, Xuyun Fu, Baoping Tang, Michael Pecht, Deep residual shrinkage networks for fault diagnosis, IEEE Transactions on Industrial Informatics, 2020, 16(7): 4681-4690.

https://ieeexplore.ieee.org/document/8850096

BibTeX

@article{Zhao2020,
  author    = {Minghang Zhao and Shisheng Zhong and Xuyun Fu and Baoping Tang and Michael Pecht},
  title     = {Deep Residual Shrinkage Networks for Fault Diagnosis},
  journal   = {IEEE Transactions on Industrial Informatics},
  year      = {2020},
  volume    = {16},
  number    = {7},
  pages     = {4681-4690},
  doi       = {10.1109/TII.2019.2943898}
}

Academic Impact

This paper has been cited more than 1,400 times on Google Scholar.

It can be seen that the Deep Residual Shrinkage Network (DRSN) has been applied or adapted in more than 1,000 publications/studies across many fields, such as mechanical engineering, electrical power, vision, healthcare, speech, text, radar, and remote sensing.