A SPIT recognition method with refined classification

Abstract: To identify users who spread spam over Internet telephony (SPIT), a recognition method based on weighted k-means and a support vector machine (SVM) is proposed. The method builds a communication behavior network model from users' historical communication activity and clusters the users with a semi-supervised weighted k-means algorithm. It then selects a portion of the user data evenly from each cluster as the training set, trains an SVM on it, and uses the resulting model to predict the remaining user data. Experimental results show that the method classifies users at a finer granularity, can make predictions, and has some machine learning ability. It can also be applied to discovering key customers and related customers and to service recommendation.
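
The clustering and sampling steps described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: "weighted" is taken to mean a per-feature weight in the Euclidean distance, semi-supervision is modeled by seeding the centroids from a few users of known class, and the two features (calls per day, mean call duration in seconds), the toy data, and all function names are hypothetical.

```python
import random

def weighted_sq_dist(a, b, weights):
    """Squared Euclidean distance with one weight per feature (assumed weighting)."""
    return sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b))

def weighted_kmeans(points, seeds, weights, max_iter=100):
    """Cluster points around seed centroids using a feature-weighted distance.

    `seeds` are initial centroids, e.g. taken from users whose class is
    already known (an assumed form of semi-supervision).
    """
    centroids = [list(s) for s in seeds]
    labels = [-1] * len(points)
    for _ in range(max_iter):
        # Assignment step: nearest centroid under the weighted distance.
        new_labels = [
            min(range(len(centroids)),
                key=lambda k: weighted_sq_dist(p, centroids[k], weights))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, l in zip(points, labels) if l == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

def sample_per_cluster(labels, per_cluster, seed=0):
    """Pick the same number of user indices from every cluster for training."""
    rng = random.Random(seed)
    train = []
    for k in sorted(set(labels)):
        members = [i for i, l in enumerate(labels) if l == k]
        train.extend(rng.sample(members, min(per_cluster, len(members))))
    chosen = set(train)
    rest = [i for i in range(len(labels)) if i not in chosen]
    return sorted(train), rest

# Hypothetical toy data: (calls per day, mean call duration in seconds).
users = [(2, 180), (3, 200), (1, 150), (90, 12), (120, 8), (100, 10)]
seeds = [(2, 180), (100, 10)]   # one known ordinary user, one known SPIT sender
weights = (1.0, 0.5)            # assumed feature weights

labels, centroids = weighted_kmeans(users, seeds, weights)
train, rest = sample_per_cluster(labels, per_cluster=2)
print(labels, train, rest)
```

The indices in `train`, together with their cluster labels, would then serve as the SVM training set, and the users in `rest` would be the ones to predict. With scikit-learn, for instance, that step could use `sklearn.svm.SVC` with `fit` and `predict`; the abstract does not state the kernel or parameters.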

Key words: data mining / k-means / support vector machine (SVM) / spam over Internet telephony