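An Unsupervised Discretization Algorithm Based on a Mixture Probabilistic Model
  • Abstract

    Real-world applications often involve many continuous numeric attributes, whereas many current machine learning algorithms require the attributes they process to take discrete values. Depending on whether the values of the associated class attribute are taken into account when a numeric attribute is discretized, discretization algorithms fall into two categories: supervised and unsupervised. Based on a mixture probabilistic model, this paper proposes a theoretically rigorous unsupervised discretization algorithm. Without prior knowledge and without a class attribute, it partitions the value range of a numeric attribute into a number of subintervals, and then uses the Bayesian information criterion to automatically find the best number of subintervals and the best partition.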

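  • Author

    Li Gang (李刚)

  • Affiliation

    School of Computing and Mathematics, 迪多大学, VIC 3168, Australia / Department of Computer Science, Shanghai University, Shanghai 201800, China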

  • Issue

    2002, Issue 2 (ISTIC, EI, PKU)

  • Keywords

    artificial intelligence; machine learning; discretization; mixture probabilistic model
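
The abstract describes the approach only at a high level: fit a mixture model to one numeric attribute, then let the Bayesian information criterion (BIC) pick the number of subintervals. The sketch below is one possible reading of that idea, not the paper's actual algorithm. It assumes Gaussian components fitted by EM, an equal-frequency initialisation, and cut points placed midway between neighbouring values whose most probable component differs. The names `fit_mixture`, `discretize`, and `max_k` are hypothetical.

```python
import math
import random


def _normal_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def _log_likelihood(xs, weights, means, variances):
    """Log-likelihood of the data under the mixture."""
    total = 0.0
    for x in xs:
        p = sum(w * _normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances))
        total += math.log(max(p, 1e-300))
    return total


def fit_mixture(values, k, iterations=100, min_var=1e-6):
    """Fit a k-component 1-D Gaussian mixture with EM (assumed model family)."""
    xs = sorted(values)
    n = len(xs)
    # Assumption: initialise each component from an equal-frequency chunk.
    chunks = [xs[i * n // k:(i + 1) * n // k] for i in range(k)]
    weights = [len(c) / n for c in chunks]
    means = [sum(c) / len(c) for c in chunks]
    variances = [max(sum((x - m) ** 2 for x in c) / len(c), min_var)
                 for c, m in zip(chunks, means)]
    for _ in range(iterations):
        # E-step: responsibility of each component for each value.
        resp = []
        for x in xs:
            dens = [w * _normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means, and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp) or 1e-300
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            variances[j] = max(
                sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj, min_var)
    return weights, means, variances, _log_likelihood(xs, weights, means, variances)


def discretize(values, max_k=5):
    """Return (number of components chosen by BIC, sorted list of cut points)."""
    n = len(values)
    best = None
    for k in range(1, max_k + 1):
        weights, means, variances, log_lik = fit_mixture(values, k)
        # BIC with 3k - 1 free parameters: k means, k variances, k - 1 weights.
        bic = -2.0 * log_lik + (3 * k - 1) * math.log(n)
        if best is None or bic < best[0]:
            best = (bic, k, weights, means, variances)
    _, k, weights, means, variances = best
    xs = sorted(values)
    # Label each value with its most probable component.
    labels = [max(range(k), key=lambda j: weights[j] * _normal_pdf(x, means[j], variances[j]))
              for x in xs]
    # Cut midway between neighbours whose labels differ.
    cuts = [(a + b) / 2.0
            for a, b, la, lb in zip(xs, xs[1:], labels, labels[1:]) if la != lb]
    return k, cuts


if __name__ == "__main__":
    random.seed(0)
    sample = ([random.gauss(0.0, 1.0) for _ in range(150)]
              + [random.gauss(20.0, 1.0) for _ in range(150)])
    print(discretize(sample))
```

The number of resulting subintervals is `len(cuts) + 1`, which can be smaller than the number of components when some component is never the most probable one. No class attribute is used at any step, which matches the unsupervised setting the abstract describes.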
