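Python log analysis script
I. Overview
When a customer has no IPS or log analysis system but still asks for log analysis, and does not want to pay for a commercial online log analysis service, the analysis has to be done by hand. If the free log analysis tools do not suit you either, you can write your own script to sort and organize the logs, then analyze the results in a text editor such as Notepad++.
II. Script structure
I have written a script for processing IIS logs. It is divided into four parts:
1. Main function
2. User interaction function
3. Matching function
4. Counting function
III. Script analysis
1. Main function
It only asks the user for the log directory, calls the user interaction function, and names the intermediate file that is used while processing the log data (the file is deleted again at the end).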
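The entry point described above can be sketched as follows. This is only an illustration written for Python 3 (the script itself is Python 2); `run` and `interact` are hypothetical names, and using a temporary file instead of a fixed file name is an assumption, not what the script does.

```python
import os
import tempfile


def run(log_dir, interact):
    """Check the log directory, create a scratch file, then call `interact`.

    `interact` is a hypothetical stand-in for the user interaction function;
    it is called as interact(log_dir, scratch_path).
    """
    if not os.path.isdir(log_dir):
        raise ValueError("not a directory: %s" % log_dir)
    fd, scratch_path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)  # only the path is needed; callers reopen the file themselves
    try:
        return interact(log_dir, scratch_path)
    finally:
        # Remove the scratch file even if the interaction raised an error.
        if os.path.exists(scratch_path):
            os.remove(scratch_path)
```

Cleaning up in a `finally` block means the intermediate file does not linger when the user interrupts the run.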
2. User interaction function
This part works interactively: the user is guided through the operation step by step.
① First, choose what to extract.
action = ['1.Ip List','2.Sql Injection List','3.Filter specific IP','4.Address List','5.XSS List','6.Scanner List','7.File Include List','8.All Type List','0.Exit']
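There are 9 choices:
(1) Ip List
Lists the client IPs that accessed the site. It calls the matching function and the counting function, and produces a file with one entry per line.
The first column is the IP, the second is the number of requests.
To turn the file into a chart, import it into Excel and split the columns on the space character.
(2) Sql Injection List
Extracts the log records of SQL injection attempts. It calls the matching function.
The matching rules are probably incomplete; if people find the script useful I may keep updating them.
(3) Filter specific IP
Extracts the log records of a single IP. It calls the matching function: the user enters an IP, and the script writes out every record whose client IP equals it.
(4) Address List
Lists the requested addresses (URL paths). It calls the matching function and the counting function, and produces a file with one entry per line.
The first column is the address, the second is the number of requests.
(5) XSS List
Extracts the log records of XSS attempts. It calls the matching function.
The matching rules are probably incomplete; if people find the script useful I may keep updating them.
(6) Scanner List
Extracts the log records left by scanners. It calls the matching function.
The matching rules are probably incomplete; if people find the script useful I may keep updating them.
(7) File Include List
Extracts the log records of file inclusion attempts. It calls the matching function.
The matching rules are probably incomplete; if people find the script useful I may keep updating them.
(8) All Type List
Runs every extraction above except Filter specific IP. This is probably the most used option because it covers all checks in one pass; the results are written under the default file names.
(0) Exit
The script runs in a loop, so enter 0 to quit.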
② After choosing an option, the user is asked for the output file name. If no name is entered, the result is saved under the default file name.
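The two steps above (pick an option in a loop, then fall back to a default file name) can be sketched compactly. This is a Python 3 illustration, not the script's own code: `result_name`, `menu_loop`, and the `actions` table are hypothetical names, and treating a whitespace-only answer as empty is an added assumption.

```python
def result_name(entered, default):
    """Return the entered file name, or the default when nothing was typed."""
    name = entered.strip()
    return name if name else default


def menu_loop(actions, read_line, report=print):
    """Ask for a choice until '0' is entered.

    actions maps a choice string to (default_file_name, handler), where
    handler is called with the chosen output file name.
    read_line(prompt) returns the user's answer; pass `input` for real use.
    """
    while True:
        choice = read_line("please select:")
        if choice == "0":
            break
        if choice not in actions:
            report("The input is invalid")
            continue
        default, handler = actions[choice]
        answer = read_line("Please enter a file name as the result:")
        handler(result_name(answer, default))
        report("Done.")
```

Keeping the default names in one table avoids repeating the same prompt-and-default code for every option.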
3. Matching function
The IP and address rules pick a value by its position in the IIS field list:
#Fields: date time s-ip cs-method cs-uri-stem cs-uri-query s-port cs-username c-ip cs(User-Agent) cs(Referer) sc-status sc-substatus sc-win32-status time-taken
The vulnerability types, by contrast, are matched against the whole log record.
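To make the field positions concrete, here is a small Python 3 sketch that looks fields up by name instead of counting positions in a regular expression. It is an alternative technique, not the script's method; `field_names` and `parse_record` are hypothetical helpers, the sample record is invented, and the sketch assumes that field values contain no spaces.

```python
FIELDS_LINE = ("#Fields: date time s-ip cs-method cs-uri-stem cs-uri-query s-port "
               "cs-username c-ip cs(User-Agent) cs(Referer) sc-status sc-substatus "
               "sc-win32-status time-taken")


def field_names(fields_line):
    """Return the column names listed after '#Fields:'."""
    return fields_line.split()[1:]


def parse_record(line, names):
    """Map one log line to {field name: value}.

    Returns None for directive lines (starting with '#') and for lines
    whose number of values does not match the field list.
    """
    if line.startswith("#"):
        return None
    values = line.split()
    if len(values) != len(names):
        return None
    return dict(zip(names, values))


# Invented sample record, for illustration only.
SAMPLE = ("2017-05-01 08:00:00 10.0.0.5 GET /index.asp id=1 80 - "
          "192.0.2.10 Mozilla/5.0 - 200 0 0 15")

record = parse_record(SAMPLE, field_names(FIELDS_LINE))
print(record["c-ip"], record["cs-uri-stem"])
```

With the field list above, `c-ip` is the 9th value and `cs-uri-stem` the 5th, which are the two positions the script's regular expressions capture.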
The matching rules are:
(1)Ip List
comp = re.compile(ur'\S+\s+\S+\s+\S+\s\S+\s\S+\s\S+\s\S+\s\S+\s(\S+)\s\S+\s\S+\s\S+\s\S+\s\S+\s')
(2)Sql Injection List
comp = "%20select%20|%20and%201=1|%20and%201=2|%20exec|%27exec| information_schema.tables|%20information_schema.tables|%20where%20|%20union%20|%20SELECT%20|%2ctable_name%20|cmdshell|%20table_schema"
(3)Filter specific IP
comp = re.compile(ur'\S+\s+\S+\s+\S+\s\S+\s\S+\s\S+\s\S+\s\S+\s(\S+)\s\S+\s\S+\s\S+\s\S+\s\S+\s')
(4)Address List
comp = re.compile(ur'\S+\s+\S+\s+\S+\s\S+\s(\S+)\s\S+\s\S+\s\S+\s\S+\s\S+\s\S+\s\S+\s\S+\s\S+\s')
(5)XSS List
comp = "%3C|%3c|%3E|%3e|%253c|%253C|%253E|%253e|alert|confirm|prompt|document.cookie|%3cscript|javascript|window.open|document.write|xss"
(6)Scanner List
comp = "sqlmap|acunetix|Netsparker|nmap|wvs|Appscan|Webinspect|Rsas|Nessus|WebReaver"
(7)File Include List
comp = "/passwd|%00|/win.ini|/my.ini|/MetaBase.xml|/ServUDaemon.ini|/shadow"
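As a rough illustration of how the keyword rules behave, the Python 3 sketch below copies the four rule strings from the list above and reports which of them occur in a line. `RULES` and `classify` are hypothetical names and the sample record is invented.

```python
import re

# Rule strings copied from the list above.
RULES = {
    "sql_injection": ("%20select%20|%20and%201=1|%20and%201=2|%20exec|%27exec|"
                      " information_schema.tables|%20information_schema.tables|"
                      "%20where%20|%20union%20|%20SELECT%20|%2ctable_name%20|"
                      "cmdshell|%20table_schema"),
    "xss": ("%3C|%3c|%3E|%3e|%253c|%253C|%253E|%253e|alert|confirm|prompt|"
            "document.cookie|%3cscript|javascript|window.open|document.write|xss"),
    "scanner": "sqlmap|acunetix|Netsparker|nmap|wvs|Appscan|Webinspect|Rsas|Nessus|WebReaver",
    "file_include": "/passwd|%00|/win.ini|/my.ini|/MetaBase.xml|/ServUDaemon.ini|/shadow",
}


def classify(line):
    """Return the sorted names of all rules that occur anywhere in the line."""
    return sorted(name for name, pattern in RULES.items()
                  if re.search(pattern, line))


# Invented sample record, for illustration only.
SAMPLE = ("2017-05-01 08:00:00 10.0.0.5 GET /news.asp id=1%20union%20select%201 80 - "
          "192.0.2.10 sqlmap/1.0 - 200 0 0 15")

print(classify(SAMPLE))
```

The rules are case-sensitive, so a mixed-case payload such as `%20Union%20Select%20` is not caught; passing `re.IGNORECASE` to `re.search` is one possible way to widen them.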
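4. Counting function
This function works on an intermediate file (listfile.txt in the code below). The IP and address extraction in the matching function write their values to that file and then call this function. It only counts how many times each value occurs and sorts the results from highest to lowest.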
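The counting step can be sketched with `collections.Counter` from the standard library. This is a Python 3 illustration rather than the script's own code; `count_values` and `format_counts` are hypothetical names, the sample addresses are invented, and skipping blank lines is an added assumption.

```python
from collections import Counter


def count_values(values):
    """Return (value, count) pairs ordered from most to least frequent."""
    stripped = (value.strip() for value in values)
    counts = Counter(value for value in stripped if value)
    return counts.most_common()


def format_counts(pairs):
    """Render pairs as 'value count' lines, the layout of the result files."""
    return "".join("%s %d\n" % (value, count) for value, count in pairs)


# Invented sample values, for illustration only.
pairs = count_values(["192.0.2.10\n", "198.51.100.7\n", "192.0.2.10\n"])
print(format_counts(pairs), end="")
```

The space-separated layout is what makes the Excel import described earlier work with a single delimiter.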
IV. Script code
# coding:utf-8
import os
import sys
import re
import operator
def A_choiceselect(logfile, log):
    listfile = os.listdir(logfile)
    action = ['1.Ip List','2.Sql Injection List','3.Filter specific IP','4.Address List','5.XSS List','6.Scanner List','7.File Include List','8.All Type List','0.Exit']
    while True:
        print 'Choices:'
        for actionnum in action:
            print actionnum
        selectaction = raw_input('please select:')
        if selectaction == '1':
            iplistresultname = raw_input('Please enter a file name as the result:')
            if iplistresultname == '':
                iplistresultname = 'Ip_List.txt'  # default file name
                print "You didn't enter a file name, so the result will be saved to Ip_List.txt"
            print "Please don't interrupt."
            B_iplist(logfile, log, iplistresultname, listfile)  # count IPs
            print 'Done.\n'
        elif selectaction == '2':
            sqlinjectlistresultname = raw_input('Please enter a file name as the result:')
            if sqlinjectlistresultname == '':
                sqlinjectlistresultname = 'Sql_Injection_List.txt'  # default file name
                print "You didn't enter a file name, so the result will be saved to Sql_Injection_List.txt"
            print "Please don't interrupt."
            B_SQLlist(logfile, sqlinjectlistresultname, listfile)  # SQL injection records
            print 'Done.\n'
        elif selectaction == '3':
            ipactionprintoutresultname = raw_input('Please enter a file name as the result:')
            if ipactionprintoutresultname == '':
                ipactionprintoutresultname = 'Filter_specific_IP.txt'  # default file name
                print "You didn't enter a file name, so the result will be saved to Filter_specific_IP.txt"
            ipaddress = raw_input('Please enter the IP address to filter on:')
            print "Please don't interrupt."
            B_ipactionprintout(logfile, ipactionprintoutresultname, listfile, ipaddress)  # records of one IP
            print 'Done.\n'
        elif selectaction == '4':
            addresslistresultname = raw_input('Please enter a file name as the result:')
            if addresslistresultname == '':
                addresslistresultname = 'Address_List.txt'  # default file name
                print "You didn't enter a file name, so the result will be saved to Address_List.txt"
            print "Please don't interrupt."
            B_Addresslist(logfile, log, addresslistresultname, listfile)  # count addresses
            print 'Done.\n'
        elif selectaction == '5':
            xsslistresultname = raw_input('Please enter a file name as the result:')
            if xsslistresultname == '':
                xsslistresultname = 'XSS_List.txt'  # default file name
                print "You didn't enter a file name, so the result will be saved to XSS_List.txt"
            print "Please don't interrupt."
            B_XSSlist(logfile, xsslistresultname, listfile)  # XSS records
            print 'Done.\n'
        elif selectaction == '6':
            scannerlistresultname = raw_input('Please enter a file name as the result:')
            if scannerlistresultname == '':
                scannerlistresultname = 'Scanner_List.txt'  # default file name
                print "You didn't enter a file name, so the result will be saved to Scanner_List.txt"
            print "Please don't interrupt."
            B_Scannerlist(logfile, scannerlistresultname, listfile)  # scanner records
            print 'Done.\n'
        elif selectaction == '7':
            fileincludelistresultname = raw_input('Please enter a file name as the result:')
            if fileincludelistresultname == '':
                fileincludelistresultname = 'File_Include_List.txt'  # default file name
                print "You didn't enter a file name, so the result will be saved to File_Include_List.txt"
            print "Please don't interrupt."
            B_Fileincludelist(logfile, fileincludelistresultname, listfile)  # file inclusion records
            print 'Done.\n'
        elif selectaction == '8':
            # Run every extraction except the single-IP filter, using the default file names.
            iplistresultname = 'Ip_List.txt'
            sqlinjectlistresultname = 'Sql_Injection_List.txt'
            addresslistresultname = 'Address_List.txt'
            xsslistresultname = 'XSS_List.txt'
            scannerlistresultname = 'Scanner_List.txt'
            fileincludelistresultname = 'File_Include_List.txt'
            print "Please don't interrupt."
            B_iplist(logfile, log, iplistresultname, listfile)
            B_SQLlist(logfile, sqlinjectlistresultname, listfile)
            B_Addresslist(logfile, log, addresslistresultname, listfile)
            B_XSSlist(logfile, xsslistresultname, listfile)
            B_Scannerlist(logfile, scannerlistresultname, listfile)
            B_Fileincludelist(logfile, fileincludelistresultname, listfile)
            print 'Done.\n'
        elif selectaction == '0':
            break
        else:
            print 'The input is invalid'
def B_iplist(logfile, log, iplistresultname, listfile):
    # Extract the client IP (c-ip, the 9th field) from every log record.
    comp = re.compile(ur'\S+\s+\S+\s+\S+\s\S+\s\S+\s\S+\s\S+\s\S+\s(\S+)\s\S+\s\S+\s\S+\s\S+\s\S+\s')
    with open(log, 'w') as c:
        for logfilename in listfile:
            with open(os.path.join(logfile, logfilename), 'r') as file_to_read:
                for lines in file_to_read:
                    # Skip directive lines such as "#Fields: ...", which would
                    # otherwise yield a field name instead of an IP.
                    if lines.startswith('#'):
                        continue
                    answer = comp.findall(lines)
                    if answer:
                        c.write(answer[0] + '\n')
    # The intermediate file is closed here, so the counts see every value.
    C_count(log, iplistresultname)
def B_SQLlist(logfile, sqlinjectlistresultname, listfile):
    # Write every log record that contains a SQL injection keyword.
    comp = "%20select%20|%20and%201=1|%20and%201=2|%20exec|%27exec| information_schema.tables|%20information_schema.tables|%20where%20|%20union%20|%20SELECT%20|%2ctable_name%20|cmdshell|%20table_schema"  # adjust the matching rules as needed
    with open(sqlinjectlistresultname, 'w') as c:
        for logfilename in listfile:
            with open(os.path.join(logfile, logfilename), 'r') as file_to_read:
                for lines in file_to_read:
                    if re.search(comp, lines):
                        c.write(lines)
def B_ipactionprintout(logfile, ipactionprintoutresultname, listfile, ipaddress):
    # Write every log record whose client IP (c-ip) equals ipaddress.
    comp = re.compile(ur'\S+\s+\S+\s+\S+\s\S+\s\S+\s\S+\s\S+\s\S+\s(\S+)\s\S+\s\S+\s\S+\s\S+\s\S+\s')
    with open(ipactionprintoutresultname, 'w') as c:
        for logfilename in listfile:
            with open(os.path.join(logfile, logfilename), 'r') as file_to_read:
                for lines in file_to_read:
                    answer = comp.findall(lines)
                    if len(answer) > 0 and answer[0] == ipaddress:
                        c.write(lines)
def B_Addresslist(logfile, log, addresslistresultname, listfile):
    # Extract the requested path (cs-uri-stem, the 5th field) from every log record.
    comp = re.compile(ur'\S+\s+\S+\s+\S+\s\S+\s(\S+)\s\S+\s\S+\s\S+\s\S+\s\S+\s\S+\s\S+\s\S+\s\S+\s')
    with open(log, 'w') as c:
        for logfilename in listfile:
            with open(os.path.join(logfile, logfilename), 'r') as file_to_read:
                for lines in file_to_read:
                    # Skip directive lines: on "#Fields: ..." the 5th token is
                    # the field name cs-method, not a path.
                    if lines.startswith('#'):
                        continue
                    answer = comp.findall(lines)
                    if answer:
                        c.write(answer[0] + '\n')
    # The intermediate file must be closed before it is counted.
    C_count(log, addresslistresultname)
def B_XSSlist(logfile, xsslistresultname, listfile):
    # Write every log record that contains an XSS keyword.
    comp = "%3C|%3c|%3E|%3e|%253c|%253C|%253E|%253e|alert|confirm|prompt|document.cookie|%3cscript|javascript|window.open|document.write|xss"  # adjust the matching rules as needed
    with open(xsslistresultname, 'w') as c:
        for logfilename in listfile:
            with open(os.path.join(logfile, logfilename), 'r') as file_to_read:
                for lines in file_to_read:
                    if re.search(comp, lines):
                        c.write(lines)
def B_Scannerlist(logfile, scannerlistresultname, listfile):
    # Write every log record that contains a known scanner name.
    comp = "sqlmap|acunetix|Netsparker|nmap|wvs|Appscan|Webinspect|Rsas|Nessus|WebReaver"  # adjust the matching rules as needed
    with open(scannerlistresultname, 'w') as c:
        for logfilename in listfile:
            with open(os.path.join(logfile, logfilename), 'r') as file_to_read:
                for lines in file_to_read:
                    if re.search(comp, lines):
                        c.write(lines)
def B_Fileincludelist(logfile, fileincludelistresultname, listfile):
    # For every record that contains a file inclusion keyword, write the
    # requested path (cs-uri-stem, the 5th field).
    comp = "/passwd|%00|/win.ini|/my.ini|/MetaBase.xml|/ServUDaemon.ini|/shadow"
    field = re.compile(ur'\S+\s+\S+\s+\S+\s\S+\s(\S+)\s\S+\s\S+\s\S+\s\S+\s\S+\s\S+\s\S+\s\S+\s\S+\s')
    with open(fileincludelistresultname, 'w') as c:
        for logfilename in listfile:
            with open(os.path.join(logfile, logfilename), 'r') as file_to_read:
                for lines in file_to_read:
                    match = re.search(comp, lines)
                    if not match:
                        continue
                    answer = field.findall(lines)
                    # Fall back to the matched keyword when the record does
                    # not have the expected number of fields.
                    fullanswer = answer[0] if answer else match.group(0)
                    c.write(fullanswer + '\n')
def C_count(log, listresultname):
    # Count how often each line occurs and write the totals, highest first.
    count_dict = {}
    with open(log, 'r') as c:
        for line in c:
            line = line.strip()
            count_dict[line] = count_dict.get(line, 0) + 1
    sorted_count_dict = sorted(count_dict.iteritems(), key=operator.itemgetter(1), reverse=True)
    with open(listresultname, 'w') as m:
        for item in sorted_count_dict:
            m.write(item[0] + ' ' + str(item[1]) + '\n')
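if __name__ == '__main__':
    print 'For example: E:\\test\\'
    logfile = raw_input("please enter your log file directory:")  # log directory
    log = 'listfile.txt'  # intermediate file holding the extracted values; can be ignored
    A_choiceselect(logfile, log)
    # The intermediate file only exists if an IP or address list was produced.
    if os.path.exists(log):
        os.remove(log)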
V. Notes
I have put the code on GitHub; feel free to download it if you need it.
https://github.com/sli-ant/-IIS-Log-Analysis I am just getting started, so please bear with me.