Please use this identifier to cite or link to this item: http://elar.urfu.ru/handle/10995/3708
Title: Transforming Message Detection
Authors: Ermakova, L.
Date of publication: 2011
Publisher: St. Petersburg University Press
Citation: Ermakova L. Transforming Message Detection / L. Ermakova // Web of Data: The joint RuSSIR/EDBT 2011 Summer School, August 15–19, 2011, Proceedings of the Fifth Russian Young Scientists Conference in Information Retrieval / B. Novikov, P. Braslavsky (Eds.). St. Petersburg, 2011. P. 15-29.
Abstract: Most existing spam filtering techniques suffer from serious disadvantages. Some produce many false positives. Others are suitable only for email filtering and cannot be used in instant messaging (IM) or social networks. Content-based methods therefore seem more efficient. One of them is based on signature retrieval; however, it is not resistant to changes in the message. Enhancements exist (e.g. checksums), but they are extremely time- and resource-consuming. The main objective of this research is therefore to develop a method for detecting transforming messages. To this end, we compared spam in several languages, namely English, French, Russian and Italian. For each language, about 1000 messages, both spam and non-spam, were examined. 135 quantitative features were extracted. Almost all of these features are language-independent. They underlie the first step of the algorithm, which is based on a support vector machine. The next stage tests the obtained results using an N-gram approach. Special attention is paid to word distortion and text alteration. The results obtained indicate the efficiency of the suggested approach.
Keywords: SPAM
TRANSFORMING MESSAGE
N-GRAMS
SVM
DAMERAU-LEVENSHTEIN DISTANCE
URI: http://elar.urfu.ru/handle/10995/3708
Conference/seminar: V Russian Summer School in Information Retrieval (RuSSIR’2011)
EDBT Summer Schools
Conference/seminar date: 15.08.2011–19.08.2011
ISBN: 978-5-288-05225-5
Sources: RuSSIR/EDBT2011
Appears in collections: Information Retrieval

Files in this item:
File: RuSSIR_2011_02.pdf
Size: 427.68 kB
Format: Adobe PDF


All items in the electronic archive are protected by copyright; all rights reserved.