Generative models and discriminative models

Let’s say we have input data $X$ and we want to classify the data into labels $Y$.

  • A generative model (生成模型) learns the joint probability distribution $\mathbb{P}(X, Y)$
  • A discriminative model (判别模型) learns the conditional probability distribution $\mathbb{P}(Y | X)$

Discriminative models learn the (hard or soft) boundary between classes, generative models model the distribution of individual classes.

Examples of generative models:

  • HMM (hidden markov model)
  • Naive Bayes
  • Gaussian mixture model and other types of mixture model
  • GAN (generative adversarial network)
  • VAE (variational autoencoder)
  • PCFG (probabilistic context-free grammar)
  • Latent Dirichlet allocation
  • Boltzmann machine

Examples of discriminative models:

  • CRF (conditional random field)
  • Logistic regression
  • SVM
  • Decision tree
  • Random forest
  • KNN (k-nearest neighbors)
  • Neural networks
  • Classical (sparse, denoising, etc.) autoencoders are generally classified as discriminative models, although denoising autoencoders can be used as generative models [1].

For sequence labeling tasks (such as Part-of-Speech tagging and Name Entity Recognition), discriminative models are said to be preferred to generative models [2].

There is a widely-held belief that discriminative models are almost always to be preferred in classification tasks [3].

One of the advantages of generative models is that they can be used to generate new data similar to existing data.

References

[1] Bengio, Y., Yao, L., & Alain, P. (2013). Generalized Denoising Auto-Encoders as Generative ModelsarXiv e-prints, arXiv:1305.6663.

[2] Why discriminative models are preferred to generative models for sequence labeling tasks? (n.d.). Cross Validated. https://stats.stackexchange.com/questions/73015/why-discriminative-models-are-preferred-to-generative-models-for-sequence-labeli

[3] Andrew Y. Ng and Michael I. Jordan, On discriminative vs. generative classifiers: A comparison of logistic regression and naive bayes, Advances in Neural Information Processing Systems 14 (T. G. Dietterich, S. Becker, and Z. Ghahramani, eds.), MIT Press, 2002, pp. 841–848.