Scrapy log monitoring: a Python log monitoring script

Reposted

数据小香 2024-05-17 02:52:48

Tags: scrapy log monitoring, python, security, experience sharing, github. Category: Operations

Python log analysis script

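I. Overview

Sometimes a customer has no IPS or log analysis system, still requires log analysis, and does not want to pay for a commercial log analysis service. The logs then have to be analyzed by hand. If the free log analysis tools do not suit you either, you can write your own script to sort and organize the logs, and then analyze the result in a text editor such as notepad++.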

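II. Script structure

I have already written a script for processing IIS logs (it targets Python 2). The script is divided into four parts:

1. Main function

2. User interaction function

3. Matching functions

4. Counting function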


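III. Script analysis

1. Main function

It only asks the user for the log path, names the intermediate file used while processing the log data, and calls the user interaction function.

2. User interaction function

The script is interactive: the user is led through the operation step by step.

① First, choose the type of report.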

action = ['1.Ip List','2.Sql Injection List','3.Filter specific IP','4.Address List','5.XSS List','6.Scanner List','7.File Include List','8.All Type List','0.Exit']


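There are nine choices:

(1) Ip List

Lists the IPs that accessed the site. It calls the matching function and the counting function and produces a file with one entry per line.

The first column is the IP and the second is the number of requests.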


To turn this into a chart, open the file in Excel and split the columns on whitespace; anyone familiar with Excel will know how.
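As an alternative to splitting the columns inside Excel, the count file can be converted to CSV first. The following is only a sketch in Python 3 (the script in this article is Python 2). It assumes each line has the form `value  count`, as written by the counting function; the name `counts_to_csv` and the file names are hypothetical.

```python
import csv


def counts_to_csv(src_path, dst_path):
    """Convert 'value  count' lines into a two-column CSV file."""
    rows = []
    with open(src_path, encoding="utf-8") as src:
        for line in src:
            # The count is the last whitespace-separated token on the line.
            parts = line.strip().rsplit(None, 1)
            if len(parts) != 2:
                continue  # skip blank or malformed lines
            rows.append((parts[0], int(parts[1])))
    with open(dst_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.writer(dst)
        writer.writerow(["value", "count"])
        writer.writerows(rows)
    return rows
```

Calling `counts_to_csv("Ip_List.txt", "Ip_List.csv")` would then give a file that Excel opens directly as two columns.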

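(2) Sql Injection List

Extracts the log records that look like SQL injection attacks. It calls the matching function.

The matching rules may not be complete; if people like this script I may keep updating it.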

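(3) Filter specific IP

Extracts the log records of a single IP. It calls the matching function: the user enters an IP and the script writes out every record whose client IP equals it.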
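The same filter can also be expressed by splitting each line into fields instead of using a regular expression. This is a sketch in Python 3, not the author's implementation. It assumes the field order of the #Fields header quoted in part 3 (Matching functions), where c-ip is the ninth field; `filter_by_ip` and `ip_index` are hypothetical names.

```python
def filter_by_ip(lines, ip, ip_index=8):
    """Yield the log lines whose client-IP field equals `ip`.

    ip_index=8 assumes c-ip is the ninth space-separated field,
    as in the #Fields header quoted in this article.
    """
    for line in lines:
        if line.startswith("#"):
            continue  # skip IIS header/comment lines
        fields = line.split()
        if len(fields) > ip_index and fields[ip_index] == ip:
            yield line
```

Because it yields lines lazily, it can be fed an open file object without loading the whole log into memory.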


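(4) Address List

Lists the addresses (URL paths) that were requested. It calls the matching function and the counting function and produces a file with one entry per line.

The first column is the address and the second is the number of requests.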


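(5) XSS List

Extracts the log records that look like XSS attacks. It calls the matching function.

The matching rules may not be complete; if people like this script I may keep updating it.

(6) Scanner List

Extracts the log records left by vulnerability scanners. It calls the matching function.

The matching rules may not be complete; if people like this script I may keep updating it.

(7) File Include List

Extracts the requests that look like file inclusion attempts; the script writes the requested path of each matching record. It calls the matching function.

The matching rules may not be complete; if people like this script I may keep updating it.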

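(8) All Type List

Runs every report above except Filter specific IP. This is probably the most used option, since it covers all the checks in one pass, but the results are always saved under the default file names.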

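(0) Exit

The script runs in a loop, so enter 0 to quit.

② After choosing a report, the user is asked for an output file name. If no name is entered, the result is saved under the default file name.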
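The script repeats this prompt-then-default logic in every menu branch. One possible way to factor it out is sketched below in Python 3. The helper `ask_filename` is hypothetical and not part of the original script; its `read` parameter exists only so the prompt can be replaced in tests.

```python
def ask_filename(prompt, default, read=input):
    """Return the name the user types, or `default` if nothing is entered."""
    name = read(prompt).strip()
    if not name:
        print("No file name entered; the result will be saved to " + default)
        return default
    return name
```

Each branch could then call something like `ask_filename('Please enter a file name as the result:', 'Ip_List.txt')`, which keeps the default name and the message from drifting apart.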

3. Matching functions

The ip and address rules match by field position, following the IIS log fields:

#Fields: date time s-ip cs-method cs-uri-stem cs-uri-query s-port cs-username c-ip cs(User-Agent) cs(Referer) sc-status sc-substatus sc-win32-status time-taken
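The positional patterns used by the script depend on exactly this field order, so a server configured to log different fields would shift the columns. A possible alternative is to read the `#Fields:` header and look the column up by name. The sketch below is Python 3, uses hypothetical names (`iter_field`), and assumes field values contain no spaces.

```python
def iter_field(lines, field_name):
    """Yield the value of `field_name` for every data line.

    Column positions come from the most recent '#Fields:' header,
    so the lookup follows whatever field order the log declares.
    """
    index = None
    for line in lines:
        if line.startswith("#Fields:"):
            names = line.split()[1:]  # drop the '#Fields:' token itself
            index = names.index(field_name) if field_name in names else None
            continue
        if line.startswith("#") or index is None:
            continue  # other header lines, or field not logged
        fields = line.split()
        if len(fields) > index:
            yield fields[index]
```

For example, `iter_field(open_log, "c-ip")` would produce the client IPs and `iter_field(open_log, "cs-uri-stem")` the requested paths, where `open_log` stands for any iterable of log lines.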

The vulnerability rules, by contrast, are matched against the whole log line.
The individual rules are:
(1) Ip List

comp = re.compile(ur'\S+\s+\S+\s+\S+\s\S+\s\S+\s\S+\s\S+\s\S+\s(\S+)\s\S+\s\S+\s\S+\s\S+\s\S+\s')

(2) Sql Injection List

comp = "%20select%20|%20and%201=1|%20and%201=2|%20exec|%27exec| information_schema.tables|%20information_schema.tables|%20where%20|%20union%20|%20SELECT%20|%2ctable_name%20|cmdshell|%20table_schema"

(3) Filter specific IP

comp = re.compile(ur'\S+\s+\S+\s+\S+\s\S+\s\S+\s\S+\s\S+\s\S+\s(\S+)\s\S+\s\S+\s\S+\s\S+\s\S+\s')

(4) Address List

comp = re.compile(ur'\S+\s+\S+\s+\S+\s\S+\s(\S+)\s\S+\s\S+\s\S+\s\S+\s\S+\s\S+\s\S+\s\S+\s\S+\s')

(5) XSS List

comp = "%3C|%3c|%3E|%3e|%253c|%253C|%253E|%253e|alert|confirm|prompt|document.cookie|%3cscript|javascript|window.open|document.write|xss"

(6) Scanner List

comp = "sqlmap|acunetix|Netsparker|nmap|wvs|Appscan|Webinspect|Rsas|Nessus|WebReaver"

(7) File Include List

comp = "/passwd|%00|/win.ini|/my.ini|/MetaBase.xml|/ServUDaemon.ini|/shadow"
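The keyword rules above are plain alternation patterns, so they are case-sensitive and an unescaped `.` matches any character. A possible refinement is sketched below in Python 3: build the pattern from a keyword list with `re.escape` and compile it once with `re.IGNORECASE`. The names `build_rule` and `matching_lines` are hypothetical, and the keyword list is only a subset of the scanner rule.

```python
import re

# A few keywords taken from the scanner rule in this article.
SCANNER_KEYWORDS = ["sqlmap", "acunetix", "Netsparker", "nmap", "Nessus"]


def build_rule(keywords):
    """Compile a case-insensitive pattern matching any keyword literally."""
    return re.compile("|".join(re.escape(k) for k in keywords), re.IGNORECASE)


def matching_lines(lines, rule):
    """Return the lines on which the rule matches at least once."""
    return [line for line in lines if rule.search(line)]
```

With this approach a rule such as `%20select%20` no longer needs a separate upper-case variant, at the cost of a few more false positives on mixed-case text.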

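4. Counting function

This function works on an intermediate file (named by the log variable, listfile.txt in the code). The IP and Address reports call it from their matching functions. It only counts how many times each value occurs and sorts the values from the highest count to the lowest.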
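For comparison, the same count-and-sort step can be done in memory with `collections.Counter`, without an intermediate file. This is a sketch in Python 3 under the assumption that the extracted values fit in memory; `count_values` is a hypothetical name.

```python
from collections import Counter


def count_values(values):
    """Return (value, count) pairs sorted from most to least frequent."""
    counter = Counter(v.strip() for v in values if v.strip())
    return counter.most_common()
```

Each returned pair can then be written out as `value  count`, matching the format of the result files described earlier.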

IV. Script code

# coding:utf-8

import os
import sys
import re
import operator

def A_choiceselect(logfile,log):
    # Interactive menu: loops until the user enters 0
    listfile = os.listdir(logfile)
    action = ['1.Ip List','2.Sql Injection List','3.Filter specific IP','4.Address List','5.XSS List','6.Scanner List','7.File Include List','8.All Type List','0.Exit']
    while True:
        print 'Choices:'
        for actionnum in action:
            print actionnum
        selectaction = raw_input('please select:')
        if selectaction == '1':
            iplistresultname = raw_input('Please enter a file name as the result:')
            if iplistresultname == '':
                iplistresultname = 'Ip_List.txt'  # default file name
                print 'You didn\'t enter a file name, so the result will be saved to Ip_List.txt'
            print 'Please don\'t interrupt.'
            B_iplist(logfile,log,iplistresultname,listfile)     # count requests per IP
            print 'Done.\n'
        elif selectaction == '2':
            sqlinjectlistresultname = raw_input('Please enter a file name as the result:')
            if sqlinjectlistresultname == '':
                sqlinjectlistresultname = 'Sql_Injection_List.txt'  # default file name
                print 'You didn\'t enter a file name, so the result will be saved to Sql_Injection_List.txt'
            print 'Please don\'t interrupt.'
            B_SQLlist(logfile,sqlinjectlistresultname,listfile)     # SQL injection records
            print 'Done.\n'
        elif selectaction == '3':
            ipactionprintoutresultname = raw_input('Please enter a file name as the result:')
            if ipactionprintoutresultname == '':
                ipactionprintoutresultname = 'Filter_specific_IP.txt'  # default file name
                print 'You didn\'t enter a file name, so the result will be saved to Filter_specific_IP.txt'
            ipaddress = raw_input('Please enter an ipaddress as the printout:')
            print 'Please don\'t interrupt.'
            B_ipactionprintout(logfile,ipactionprintoutresultname,listfile,ipaddress)     # records of one IP
            print 'Done.\n'
        elif selectaction == '4':
            addresslistresultname = raw_input('Please enter a file name as the result:')
            if addresslistresultname == '':
                addresslistresultname = 'Address_List.txt'  # default file name
                print 'You didn\'t enter a file name, so the result will be saved to Address_List.txt'
            print 'Please don\'t interrupt.'
            B_Addresslist(logfile,log,addresslistresultname,listfile)     # count requests per address
            print 'Done.\n'
        elif selectaction == '5':
            xsslistresultname = raw_input('Please enter a file name as the result:')
            if xsslistresultname == '':
                xsslistresultname = 'XSS_List.txt'  # default file name
                print 'You didn\'t enter a file name, so the result will be saved to XSS_List.txt'
            print 'Please don\'t interrupt.'
            B_XSSlist(logfile,xsslistresultname,listfile)     # XSS records
            print 'Done.\n'
        elif selectaction == '6':
            scannerlistresultname = raw_input('Please enter a file name as the result:')
            if scannerlistresultname == '':
                scannerlistresultname = 'Scanner_List.txt'  # default file name
                print 'You didn\'t enter a file name, so the result will be saved to Scanner_List.txt'
            print 'Please don\'t interrupt.'
            B_Scannerlist(logfile,scannerlistresultname,listfile)     # scanner records
            print 'Done.\n'
        elif selectaction == '7':
            fileincludelistresultname = raw_input('Please enter a file name as the result:')
            if fileincludelistresultname == '':
                fileincludelistresultname = 'File_Include_List.txt'  # default file name
                print 'You didn\'t enter a file name, so the result will be saved to File_Include_List.txt'
            print 'Please don\'t interrupt.'
            B_Fileincludelist(logfile,fileincludelistresultname,listfile)     # file inclusion records
            print 'Done.\n'
        elif selectaction == '8':
            # Run every report except "Filter specific IP", using the default file names
            iplistresultname = 'Ip_List.txt'
            sqlinjectlistresultname = 'Sql_Injection_List.txt'
            addresslistresultname = 'Address_List.txt'
            xsslistresultname = 'XSS_List.txt'
            scannerlistresultname = 'Scanner_List.txt'
            fileincludelistresultname = 'File_Include_List.txt'
            print 'Please don\'t interrupt.'
            B_iplist(logfile,log,iplistresultname,listfile)
            B_SQLlist(logfile,sqlinjectlistresultname,listfile)
            B_Addresslist(logfile,log,addresslistresultname,listfile)
            B_XSSlist(logfile,xsslistresultname,listfile)
            B_Scannerlist(logfile,scannerlistresultname,listfile)
            B_Fileincludelist(logfile,fileincludelistresultname,listfile)
            print 'Done.\n'
        elif selectaction == '0':
            break
        else:
            print 'The input is invalid'

def B_iplist(logfile,log,iplistresultname,listfile):
    # Extract the client IP (c-ip, the 9th field) from every log line
    comp = re.compile(r'\S+\s+\S+\s+\S+\s\S+\s\S+\s\S+\s\S+\s\S+\s(\S+)\s\S+\s\S+\s\S+\s\S+\s\S+\s')
    c = open(log,'w+')
    for logfilename in listfile:
        with open(os.path.join(logfile,logfilename),'r') as file_to_read:
            for lines in file_to_read:
                fullanswer = ''  # reset so a non-matching line cannot reuse the previous value
                answer = comp.findall(lines)
                if len(answer) > 0:
                    fullanswer = answer[0]
                # 'cs-username' is what the pattern captures on the '#Fields:' header line
                if fullanswer != '' and fullanswer != 'cs-username':
                    c.write(fullanswer+'\n')
    c.close()
    C_count(log,iplistresultname)

def B_SQLlist(logfile,sqlinjectlistresultname,listfile):
    # Write out every log line that matches a SQL injection keyword
    comp = "%20select%20|%20and%201=1|%20and%201=2|%20exec|%27exec| information_schema.tables|%20information_schema.tables|%20where%20|%20union%20|%20SELECT%20|%2ctable_name%20|cmdshell|%20table_schema"  # adjust the matching rules as needed
    c = open(sqlinjectlistresultname,'w+')
    for logfilename in listfile:
        with open(os.path.join(logfile,logfilename),'r') as file_to_read:
            for lines in file_to_read:
                if re.search(comp,lines):
                    c.write(lines)
    c.close()

def B_ipactionprintout(logfile,ipactionprintoutresultname,listfile,ipaddress):
    # Write out every log line whose client IP equals ipaddress
    comp = re.compile(r'\S+\s+\S+\s+\S+\s\S+\s\S+\s\S+\s\S+\s\S+\s(\S+)\s\S+\s\S+\s\S+\s\S+\s\S+\s')
    c = open(ipactionprintoutresultname,'w+')
    for logfilename in listfile:
        with open(os.path.join(logfile,logfilename),'r') as file_to_read:
            for lines in file_to_read:
                answer = comp.findall(lines)
                if len(answer) > 0 and answer[0] == ipaddress:
                    c.write(lines)
    c.close()

def B_Addresslist(logfile,log,addresslistresultname,listfile):
    # Extract the requested path (cs-uri-stem, the 5th field) from every log line
    comp = re.compile(r'\S+\s+\S+\s+\S+\s\S+\s(\S+)\s\S+\s\S+\s\S+\s\S+\s\S+\s\S+\s\S+\s\S+\s\S+\s')
    c = open(log,'w+')
    for logfilename in listfile:
        with open(os.path.join(logfile,logfilename),'r') as file_to_read:
            for lines in file_to_read:
                fullanswer = ''  # reset so a non-matching line cannot reuse the previous value
                answer = comp.findall(lines)
                if len(answer) > 0:
                    fullanswer = answer[0]
                # 'cs-method' is what the pattern captures on the '#Fields:' header line
                if fullanswer != '' and fullanswer != 'cs-method':
                    c.write(fullanswer+'\n')
    c.close()  # flush the intermediate file before it is counted
    C_count(log,addresslistresultname)

def B_XSSlist(logfile,xsslistresultname,listfile):
    # Write out every log line that matches an XSS keyword
    comp = "%3C|%3c|%3E|%3e|%253c|%253C|%253E|%253e|alert|confirm|prompt|document.cookie|%3cscript|javascript|window.open|document.write|xss"   # adjust the matching rules as needed
    c = open(xsslistresultname,'w+')
    for logfilename in listfile:
        with open(os.path.join(logfile,logfilename),'r') as file_to_read:
            for lines in file_to_read:
                if re.search(comp,lines):
                    c.write(lines)
    c.close()

def B_Scannerlist(logfile,scannerlistresultname,listfile):
    # Write out every log line that mentions a known scanner
    comp = "sqlmap|acunetix|Netsparker|nmap|wvs|Appscan|Webinspect|Rsas|Nessus|WebReaver"   # adjust the matching rules as needed
    c = open(scannerlistresultname,'w+')
    for logfilename in listfile:
        with open(os.path.join(logfile,logfilename),'r') as file_to_read:
            for lines in file_to_read:
                if re.search(comp,lines):
                    c.write(lines)
    c.close()

def B_Fileincludelist(logfile,fileincludelistresultname,listfile):
    # Write out the requested path of every line that matches a file inclusion keyword
    comp = "/passwd|%00|/win.ini|/my.ini|/MetaBase.xml|/ServUDaemon.ini|/shadow"
    addresscomp = re.compile(r'\S+\s+\S+\s+\S+\s\S+\s(\S+)\s\S+\s\S+\s\S+\s\S+\s\S+\s\S+\s\S+\s\S+\s\S+\s')
    c = open(fileincludelistresultname,'w+')
    for logfilename in listfile:
        with open(os.path.join(logfile,logfilename),'r') as file_to_read:
            for lines in file_to_read:
                hit = re.search(comp,lines)
                if hit:
                    answer = addresscomp.findall(lines)
                    # fall back to the matched keyword if the line has an unexpected layout
                    c.write((answer[0] if answer else hit.group(0))+'\n')
    c.close()
    
def C_count(log,listresultname):
    # Count how often each line occurs and write the totals, highest first
    count_dict = {}
    with open(log,'r') as c:
        for line in c:
            line = line.strip()
            count_dict[line] = count_dict.get(line, 0) + 1
    sorted_count_dict = sorted(count_dict.items(), key=operator.itemgetter(1), reverse=True)
    with open(listresultname,'w+') as m:
        for item in sorted_count_dict:
            m.write(item[0]+'  '+str(item[1])+'\n')
 
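if __name__ == '__main__':
    print 'For example: E:\\test\\'
    logfile = raw_input("please enter your log file directory:")   # directory that contains the logs
    log = 'listfile.txt'  # intermediate file holding the extracted values; can be ignored
    A_choiceselect(logfile,log)
    if os.path.exists(log):  # only created by the Ip List and Address List reports
        os.remove(log)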

V. Notes
I have put the code on GitHub; download it if you need it.
https://github.com/sli-ant/-IIS-Log-Analysis I am just getting started, so please bear with me.
