最早用elasticsearch应该是16年11月分左右, 爬虫爬回来的数据存到里面去,然后到后来为了做日志分析又玩了一段时间.后来还经历过大量es暴露公网的事情,版本跳跃的太快.最后看着她又多出了machine learning功能. 突然想起来大三时用luence搭建搜索引擎没成功.

Logstash

这是我的日志文件

2009-12-31 00:00:04 W3SVC1 22x.xxxx.7.21 GET /Newmsg/rccp_news.asp classid=233&siteid=2910 80 - 124.115.1.59 Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1) 200 0 0
2009-12-31 00:00:27 W3SVC1 22x.xxxx.7.21 GET /luntan/topic_stats.asp sort_order=T_VIEW_COUNT&whichpage=3 80 - 124.115.1.59 Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1) 200 0 0
2009-12-31 00:01:03 W3SVC1 22x.xxxx.7.21 GET /rdxw.asp Page=60 80 - 67.195.114.58 Mozilla/5.0+(compatible;+Yahoo!+Slurp/3.0;+http://help.yahoo.com/help/us/ysearch/slurp) 200 0 0
2009-12-31 00:01:13 W3SVC1 22x.xxxx.7.21 GET /rsdt.asp Page=24 80 - 124.115.4.205 Sosospider+(+http://help.soso.com/webspider.htm) 200 0 0

写个匹配去匹配这些东西.

%{TIMESTAMP_ISO8601:YYYY-MM-dd HH:mm:ss} %{WORD:serviceName} %{IP:serverIP} %{WORD:method} %{URIPATH:uriStem} %{NOTSPACE:uriQuery} %{NUMBER:port} %{NOTSPACE:username} %{IPORHOST:clientIP} %{NOTSPACE:userAgent} %{NUMBER:response} %{NUMBER:subresponse} %{NUMBER:win32response}"

这是我完整的isslog.conf文件

input {  
  file {
    type => "iis-w3c"
    path => "/home/I/biglog/log/2010/*.log"
    start_position => "beginning"
  }

}


filter {
 #ignore log comments
 if [fusion_builder_container hundred_percent="yes" overflow="visible"][fusion_builder_row][fusion_builder_column type="1_1" background_position="left top" background_color="" border_size="" border_color="" border_style="solid" spacing="yes" background_image="" background_repeat="no-repeat" padding="" margin_top="0px" margin_bottom="0px" class="" id="" animation_type="" animation_speed="0.3" animation_direction="left" hide_on_mobile="no" center_content="no" min_height="none"][message] =~ "^#" {
 drop {}
 }
 
 
 grok {
 # check that fields match your IIS log settings
 match => ["message", "%{TIMESTAMP_ISO8601:log_timestamp} %{WORD:serviceName} %{IP:serverIP} %{WORD:method} %{URIPATH:uriStem} %{NOTSPACE:uriQuery} %{NUMBER:port} %{NOTSPACE:username} %{IPORHOST:clientIP} %{NOTSPACE:userAgent} %{NUMBER:response} %{NUMBER:subresponse} %{NUMBER:win32response}"]
 }
 #Set the Event Timesteamp from the log
 date {
 match => [ "log_timestamp", "YYYY-MM-dd HH:mm:ss" ]
 timezone => "Etc/UTC"
 }
 useragent {
 source=> "useragent"
 prefix=> "browser"
 }
 mutate {
 remove_field => [ "log_timestamp"]
 }
}
 
output {
stdout {}
elasticsearch {
hosts => ["127.0.0.1:9200"]
}
stdout { codec => rubydebug }
}

写好配置之后直接logstash -f isslog.conf ,但是我导入之后,在kibana里面看不到,去es里查,发现已经导进去了,但是就是看不到,后来突然想到是时间不对,分析的这份日志时间跨度为09-14年,所以当把时间区间调到这个之后,就在kibana里面看到结果了.

grokdebug online

X-pack

版本一定要一样,不一样不行,小版本号也要一样.安装可以直接xxx-plugin install x-pack (xxx为elk中的任意一个),但是速度太慢,可以先挂代理下载到本地,然后本地安装,直接 file://指向即可,我用的是6.0,所以直接
/kibana-6.0.0-linux-x86_64/bin/kibana-plugin install file:///home/mour/devops/elk/x-pack-6.0.0.zip 就可以了,安装之后,记得重启kibana.
x-pack提供了机器学习功能(这就是我为啥安装他),但是这个只有30天试用.具体激活看文档.我还在导数据
curl -XGET -u root:root 'http://localhost:9200/_xpack/license/trial_status'
直接在配置里写上:

xpack.license.self_generated.type: trial

机器学习

根据分析结果来看,性能非常稳定,而且十分迅速,即便针对亿级的数据进行机器学习,基本以秒级的速度给出结果,均速扫描整个数据集.完全可以实时的对新的数据流以feed的形势添加进来分析,感觉自己把数据从es导入到spark,利用spark ml比较靠谱.(es本身也提供了一些无监督学习算法以及基于时间序列的异常数据检测)

其他

各个配置文件都在对应目录的config目录下,很容易看的懂,需要什么查查文档,不难.
uniq一定要配合sort否则结果不对,不如用awk,find东西时,单独exec没有用xargs快
有个elasticsearchdump 用来备份es数据,非常好用,基于nodejs的,但是速度不理想(测试远程服务器时.本地尚未测试)

使用./elasticsearch-6.0.0/bin/x-pack/setup-passwords auto 自动设置密码,或者使用interative自己输入.设置之后,需要去kibana里配置下es的用户名密码,找到注释的那一项,取消掉,填进去刚刚自动生成的用户名密码.然后呢,在禁用掉x-pack 的security选项,配置文件中添加xpack.security.enabled: false即可.

生成密码之后,logstash要配置一下,然后才能接着向elasticsearch导入数据
可以写在input里,也可以写在input外
1
2
3
4
5
elasticsearch {
hosts => ["127.0.0.1:9200"]
username => "elastic"
password => "epPz+Oxxxxxxxx&VWnrc"
}
配置文档
使用ml需要分别在kibnana和elastic里面配置一下
kibana
1
xpack.ml.enabled : true
elastic
1
2
3
xpack.ml.enabled : true
node.ml : true

为什么es 有的索引是red

bash

#!/bin/bash

for index in $(curl  -s 'http://localhost:9200/_cat/shards' | grep UNASSIGNED | awk '{print $1}' | sort | uniq); do
    for shard in $(curl  -s 'http://localhost:9200/_cat/shards' | grep UNASSIGNED | grep $index | awk '{print $2}' | sort | uniq); do
        echo  $index $shard

        curl -XPOST 'localhost:9200/_cluster/reroute' -d "{
            'commands' : [ {
                  'allocate' : {
                      'index' : $index,
                      'shard' : $shard,
                      'node' : 'Master',
                      'allow_primary' : true
                  }
                }
            ]
        }"

        sleep 5
    done
done

解决办法参考自这里
 查找错误

overhead错误

不知道为什么,启动一会儿就没有了,状态全部正常之后,又出现了overhead错误

为什么out of memory 明明还有20G的内存

这是java的内存堆导致的,需要手动设置才行.编辑config下面的jvm.options的文件,然后给-Xms1g改大点,改成-Xms8g就可以了,改完之后,不会有out of memory, 也没有head map问题,也没有cluster block问题了,就可以用了.但是文档也说了,即便你有1个T的内存,也不要分配超过32G的内存给他.
Out of memory

Todo:

利用spark ml进行分析