有以下一个excel表格:

《五.批量数据统计和分析报表》_excel表格

里面有65353条用户信息:

《五.批量数据统计和分析报表》_用户信息_02

使用xlrd或者openpyxl框架完成一下任务

1.    

统计和分析以下问题数据:
a) 统计表格中有多少人:
b) 统计办电信,联通,移动的用户数量并计算出三种用户的占比

c) 总公司男女人数
d) 年龄超过45岁的老员工人数
e) 薪资高于8000元的高薪人员数量和薪资低于3000的底薪人员数量
f) 统计去传媒公司的工作的人员数量
g) 统计一下可能在疫情高危地区的人数(高危地区:黑龙江,北京,福建,四川)

《五.批量数据统计和分析报表》_excel表格_03


import xlrd

wb = xlrd.open_workbook(filename=r"5.baidu-员工的人员信息.xls")
st = wb.sheet_by_index(0)

print("总共:",st.nrows-1,"个人")
###############处理手机号码
count = {"yidong":0,"liantong":0,"dianxin":0}
data = st.col_values(5)[1:]
# 移动的电话规则
yidong = ["134","135","136","137","138","139","147",
          "150","151","152","157","158","159","178",
          "182","183","184","187","188","1703","1705","1706"]
# 联通的电话规则
liantong = ["130","131","132","145","155","156","175",
          "176","185","186","1704","1707","1708","1709","171"]
# 电信的电话规则
dianxin = ["133","149","153","173","177","180","181","189","1700","1701","1702"]

for i in data:
    num = ""
    if str(i).startswith("170"):  # 判断是170的号开头,截取4位开头
        num = i[:4]
    else:  # 否则截取 3位开头
        num = i[:3]

    if num in yidong:
        count["yidong"] = count["yidong"] + 1
    elif num in liantong:
        count["liantong"] = count["liantong"] + 1
    elif num in dianxin:
        count["dianxin"] = count["dianxin"] + 1

print(count)


##############统计男女

sex = st.col_values(8)
sex_count = {"female":0,"male":0}
for j in sex[1:]:
    if j == "男":
        sex_count["male"] = sex_count["male"] + 1
    else:
        sex_count["female"] = sex_count["female"] + 1

print(sex_count)

###############年龄
age = st.col_values(7)[1:]
lt45 = 0

for k in age:
    if k > 45:
        lt45 =  lt45 + 1
print(lt45)

##############薪资高于8000元的高薪人员数量和薪资低于3000的底薪人员数量
sal = st.col_values(11)[1:]
lt8 = 0
gt3 = 0

for k in sal:
    if k >= 8000:
        lt8 += 1
    elif k < 3000:
        gt3 += 1
print("大于8000的人数:",lt8,",小于3000的人:",gt3)

###############统计去传媒公司的工作的人员数量
company = st.col_values(13)[1:]
ct = 0
for row in company:
    if str.__contains__(row,"传媒"):
        ct += 1

print("传媒公司上班人数:",ct)
######################高风险地区统计: 第
dan_add = st.col_values(9)[1:]

li = ["黑龙江","北京市","福建省","四川省"]
count = {}
for i in dan_add:
    address = i[:3]
    if address in li:
        if address in count:
            count[address] +=  1
        else:
            count[address] = 1
print("高风险人数:",count)