Deep Belief Network

Adarsha Regmi
Jan 15, 2022

A deep belief network (DBN) is a model that has played a leading role in AI. In this post I try to explain the main ideas behind it.

  • A deep belief network is a deep architecture that combines unsupervised and supervised learning.
DBN architecture

This post covers the following sections:

  • What is a Boltzmann Machine?
  • Restricted Boltzmann Machine
  • Deep Belief Network

Feeling unsure? Don't worry: by the end of this post you'll know DBNs well enough to march ahead. Believe me, or believe in nets. I know you don't believe me yet, but take it in deep, whichever side you are on.

1. Boltzmann Machine

What the heck is it?

Put simply: it is a network of stochastic units. It is an energy-based model (EBM), and the probability of each of its states follows a Gibbs (Boltzmann) distribution.

What does it do? (optional)

The idea is borrowed from statistical physics, where quantities such as entropy and temperature determine how likely each state of a system is.

Strange thing: it is a model, but it has no output nodes.

If you know some ML: usually a model has an output, and a learning rule such as gradient descent adjusts the weights and other parameters to match it (that is what makes it a learning model). A Boltzmann machine has no output to match. Instead, its hidden nodes learn a representation of the input given to the visible nodes, denoted v in the image above. That is why it falls under unsupervised learning.

2. Restricted Boltzmann Machine (RBM)

two layers of RBM

An RBM is a two-layer architecture that can learn a probability distribution over its set of inputs. It is used for applications such as dimensionality reduction, feature learning, classification, and regression.

It differs from a general Boltzmann machine in that nodes in the same layer are not connected to each other; connections exist only between the visible layer and the hidden layer.
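Since an RBM is an energy-based model, it may help to see how the energy of one configuration could be computed. The sketch below assumes binary units and uses the usual textbook form of the energy; the names `rbm_energy`, `a` (visible bias), and `b` (hidden bias) are my own choices for illustration. Note that `W` only links visible units to hidden units, which is exactly the "restricted" part.

```python
import numpy as np

def rbm_energy(v, h, W, a, b):
    """Energy of one joint configuration (v, h) of a binary RBM.

    E(v, h) = -a.v - b.h - v^T W h
    W has shape (n_visible, n_hidden): there are no visible-visible
    or hidden-hidden weights.
    """
    return -(a @ v) - (b @ h) - (v @ W @ h)

# Tiny hand-made example (all numbers are illustrative only).
v = np.array([1.0, 0.0, 1.0])           # 3 visible units
h = np.array([1.0, 0.0])                # 2 hidden units
W = np.array([[0.5, -0.2],
              [0.1,  0.3],
              [-0.4, 0.2]])
a = np.zeros(3)                         # visible biases (assumed zero here)
b = np.array([0.1, -0.1])               # hidden biases

energy = rbm_energy(v, h, W, a, b)
print(energy)
```

Lower energy means a more probable configuration, so flipping units and recomputing the energy is a quick way to build intuition.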

Forward Pass + Backward Pass

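Forward Pass

In the forward pass, the visible units are propagated through the weights to compute the hidden units (figure: forward pass).

Backward Pass

In the backward pass, the hidden units are propagated back through the same weights to reconstruct the visible units.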

The weights of an RBM are randomly initialized, and then we run alternating Gibbs sampling:

All the units in one layer are updated in parallel given the current states of the units in the other layer, and this process is repeated until the system is sampling from its equilibrium distribution.

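The probability that hidden unit j is on, given the visible vector v, is

$$P(h_j = 1 \mid \mathbf{v}) = g\Big(b_j + \sum_i v_i W_{ij}\Big)$$

where g is the sigmoid function, b_j is the bias of hidden unit j, v_i is the state of visible unit i, and W_ij is the weight between them.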
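The probability above is easy to compute with NumPy. Here is a minimal sketch; the function names and the example shapes are my own assumptions, not part of any particular library.

```python
import numpy as np

def sigmoid(x):
    """Logistic sigmoid g(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-x))

def hidden_probabilities(v, W, b):
    """P(h_j = 1 | v) = g(b_j + sum_i v_i * W_ij) for every hidden unit j."""
    return sigmoid(b + v @ W)

v = np.array([1.0, 0.0, 1.0])   # states of 3 visible units
W = np.zeros((3, 2))            # weights, shape (n_visible, n_hidden)
b = np.zeros(2)                 # hidden biases

p_h = hidden_probabilities(v, W, b)
print(p_h)  # with zero weights and biases every probability is 0.5
```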

Computing the model's statistics exactly is difficult. They can be estimated by starting from the visible units and running Gibbs sampling for a long time, but that is too slow to be a practical way to train. Instead, Contrastive Divergence (CD) is used, which is explained below.
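To make the cost concrete, here is a rough sketch of what a long alternating Gibbs chain could look like. It assumes binary units and a visible bias `a` (not shown in the equation above); `gibbs_chain` and all the sizes are hypothetical names and values chosen for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gibbs_chain(v0, W, a, b, n_steps, rng):
    """Alternate between sampling h given v and v given h for n_steps (n_steps >= 1)."""
    v = v0.copy()
    for _ in range(n_steps):
        p_h = sigmoid(b + v @ W)                          # P(h_j = 1 | v)
        h = (rng.random(p_h.shape) < p_h).astype(float)   # sample all hidden units in parallel
        p_v = sigmoid(a + h @ W.T)                        # P(v_i = 1 | h)
        v = (rng.random(p_v.shape) < p_v).astype(float)   # sample all visible units in parallel
    return v, h

rng = np.random.default_rng(0)
n_visible, n_hidden = 6, 3
W = rng.normal(0.0, 0.01, size=(n_visible, n_hidden))    # small random initial weights
a = np.zeros(n_visible)                                  # visible biases (assumption)
b = np.zeros(n_hidden)                                   # hidden biases
v0 = rng.integers(0, 2, size=n_visible).astype(float)    # random binary start state

v_sample, h_sample = gibbs_chain(v0, W, a, b, n_steps=1000, rng=rng)
```

Every training update would need a chain like this, which is why running it to equilibrium is avoided in practice.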

Contrastive Divergence (CD)

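Because running Gibbs sampling until equilibrium takes too long, CD runs the chain for only a few steps. First the visible nodes are initialized with a training example, then the hidden nodes are computed using the probability equation:

$$P(h_j = 1 \mid \mathbf{v}) = g\Big(b_j + \sum_i v_i W_{ij}\Big)$$

Then the visible nodes are recomputed (reconstructed) from the hidden nodes in the same way. The weights are adjusted using the difference between the data and its reconstruction. In a sense we are looking for low-energy states: training lowers the energy of configurations that resemble the training data.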
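A single-step version of this procedure (often called CD-1) can be sketched as follows. This is only an illustration under my own assumptions: binary units, a visible bias `a`, a toy dataset with two patterns, and made-up values for the learning rate and number of updates. `cd1_update` is a hypothetical helper name.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(v0, W, a, b, lr, rng):
    """One CD-1 update on a batch v0 of shape (batch, n_visible).

    Updates W, a, b in place and returns the mean squared reconstruction error.
    """
    # Positive phase: hidden probabilities and samples given the data.
    p_h0 = sigmoid(b + v0 @ W)
    h0 = (rng.random(p_h0.shape) < p_h0).astype(float)
    # Negative phase: reconstruct the visible units, then recompute the hidden units.
    p_v1 = sigmoid(a + h0 @ W.T)
    p_h1 = sigmoid(b + p_v1 @ W)
    # Move toward the data statistics and away from the reconstruction statistics.
    n = v0.shape[0]
    W += lr * (v0.T @ p_h0 - p_v1.T @ p_h1) / n
    a += lr * (v0 - p_v1).mean(axis=0)
    b += lr * (p_h0 - p_h1).mean(axis=0)
    return np.mean((v0 - p_v1) ** 2)

rng = np.random.default_rng(0)
# Toy data: two binary patterns repeated ten times each.
data = np.array([[1, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 1]] * 10, dtype=float)
W = rng.normal(0.0, 0.1, size=(6, 2))   # random initial weights
a = np.zeros(6)                          # visible biases (assumption)
b = np.zeros(2)                          # hidden biases

errors = [cd1_update(data, W, a, b, lr=0.1, rng=rng) for _ in range(500)]
print(errors[0], errors[-1])  # the reconstruction error should go down over training
```

Reconstruction error is only a convenient progress signal here; it is not the quantity that CD actually optimizes.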

3. Deep Belief Network

A DBN is an architecture built from stacked RBMs. Each RBM performs a non-linear transformation on its input vectors, and its output vectors are used as the input to the next RBM. Being a generative model allows a DBN to be used in either an unsupervised or a supervised setting.

More precisely, for feature learning we pre-train the RBMs that form the DBN layer by layer in an unsupervised manner, and then we use back-propagation (i.e. gradient descent) to fine-tune on a small labelled dataset for classification and other tasks.
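The layer-by-layer pre-training can be sketched roughly like this. It is a simplified illustration, not a full DBN: each RBM is trained with CD-1 on the whole dataset at once, hidden probabilities (not samples) are passed to the next layer, and the supervised fine-tuning step is left out. The function names, layer sizes, and hyperparameters are all assumptions of mine.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_rbm(data, n_hidden, epochs, lr, rng):
    """Train one RBM with CD-1 and return its weights and hidden biases."""
    n_visible = data.shape[1]
    W = rng.normal(0.0, 0.1, size=(n_visible, n_hidden))
    a = np.zeros(n_visible)   # visible biases
    b = np.zeros(n_hidden)    # hidden biases
    for _ in range(epochs):
        p_h0 = sigmoid(b + data @ W)
        h0 = (rng.random(p_h0.shape) < p_h0).astype(float)
        p_v1 = sigmoid(a + h0 @ W.T)
        p_h1 = sigmoid(b + p_v1 @ W)
        W += lr * (data.T @ p_h0 - p_v1.T @ p_h1) / len(data)
        a += lr * (data - p_v1).mean(axis=0)
        b += lr * (p_h0 - p_h1).mean(axis=0)
    return W, b

def pretrain_dbn(data, layer_sizes, epochs, lr, rng):
    """Greedy layer-wise pre-training: the output of each RBM feeds the next one."""
    layers = []
    layer_input = data
    for n_hidden in layer_sizes:
        W, b = train_rbm(layer_input, n_hidden, epochs, lr, rng)
        layers.append((W, b))
        layer_input = sigmoid(b + layer_input @ W)   # features for the next RBM
    return layers

def dbn_features(data, layers):
    """Pass data up through all pre-trained layers."""
    out = data
    for W, b in layers:
        out = sigmoid(b + out @ W)
    return out

rng = np.random.default_rng(0)
data = rng.integers(0, 2, size=(50, 8)).astype(float)   # random toy data
layers = pretrain_dbn(data, layer_sizes=[6, 4], epochs=20, lr=0.1, rng=rng)
features = dbn_features(data, layers)
print(features.shape)
```

For a supervised task, the returned features (or the pre-trained weights themselves) would then be the starting point for fine-tuning with back-propagation on the labelled data.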


4. Applications of DBN

  • Image Classification
  • Motion capture
  • Video recognition
  • Image Generation

For an implementation, check this link.

For sample testing, check this site.

I am building a GitHub repository with an application that uses a DBN, and I will link it here as soon as possible. For a video explanation, check my YouTube link.

For GitHub, check