Reading data
I. Reading CSV/Excel files
Method 1: reading with pandas
CSV example:
import pandas as pd
train = pd.read_csv("E:/kaggle/House-price/train.csv")
print(train['Alley'][0])
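>> nan
This returns the value at row 0 of the Alley column of the CSV file. The cell holds NA in the file, which pandas parses as a missing value by default, so it prints as nan. Excel files can be read the same way with pd.read_excel.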
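pandas treats the string NA as a missing value by default, which matters for columns such as Alley where NA is a real category. The sketch below uses a small in-memory CSV (made-up sample data, not the actual train.csv) to compare the default behaviour with keep_default_na=False, which keeps the raw string.

```python
import io

import pandas as pd

# Hypothetical two-row sample standing in for train.csv.
sample_csv = "Id,Alley\n1,NA\n2,Pave\n"

# Default behaviour: the string "NA" is parsed as a missing value (NaN).
default_df = pd.read_csv(io.StringIO(sample_csv))
first_default = default_df["Alley"][0]

# keep_default_na=False leaves "NA" as an ordinary string.
raw_df = pd.read_csv(io.StringIO(sample_csv), keep_default_na=False)
first_raw = raw_df["Alley"][0]

print(pd.isna(first_default))
print(first_raw)
```

Whether NA should count as missing depends on the dataset, so the choice between the two calls is a judgment about the data rather than a fixed rule.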
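Method 2: reading with the built-in open()
# open() is a built-in function, so no import is needed
with open('E:/kaggle/House-price/train.csv', encoding='utf-8') as file_obj:
    contents = file_obj.read()  # the whole file as one string
    print(contents)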
Reading line by line:
with open('E:/kaggle/House-price/train.csv', encoding='utf-8') as file_obj:
    for line in file_obj:
        # each line keeps its trailing newline, so suppress the one print() adds
        print(line, end='')
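Reading with open() yields raw text, so the fields still have to be split by hand. One possible way to do that with the standard library is csv.DictReader, sketched here on a temporary file with made-up contents (the file name and data are assumptions for illustration).

```python
import csv
import os
import tempfile

# Write a tiny hypothetical CSV so the example is self-contained.
tmp_dir = tempfile.mkdtemp()
csv_path = os.path.join(tmp_dir, "sample.csv")
with open(csv_path, "w", encoding="utf-8", newline="") as f:
    f.write("Id,Alley\n1,NA\n2,Pave\n")

# The csv module documentation recommends opening files with newline="".
with open(csv_path, encoding="utf-8", newline="") as file_obj:
    # Each row becomes a dict keyed by the header line; all values are str.
    rows = list(csv.DictReader(file_obj))

print(rows[0]["Alley"])
```

Unlike pandas, the csv module does no type or missing-value conversion, so NA stays the string "NA".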
II. Reading JSON files
JSON is a key-value structure, and a value can itself be a list. Strings in JSON must use double quotes:
{"key1": "value1", "key2": "value2", "key3": "value3"}
{"key1": ["v11", "v12", "v13"], "key2": "v22"}
1. The json.loads() method
import json
with open('F:/rrc/squad/train.json', encoding='utf-8') as f:
    line = f.readline()
    print(type(line))
# no f.close() needed: the with block closes the file
>> <class 'str'>
As shown above, each line read from the JSON file is a str. json.loads() converts it to a dict, which is easier to work with in Python.
import json
with open('F:/rrc/squad/train.json', encoding='utf-8') as f:
    line = f.readline()
    d = json.loads(line)  # parse one line of JSON text into a dict
    print(type(d))
# no f.close() needed: the with block closes the file
>> <class 'dict'>
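When every line of the file holds one complete JSON object, the same idea extends to the whole file by looping over the lines. This is a minimal sketch on a temporary file with made-up records; the file name and contents are assumptions, not the actual train.json.

```python
import json
import os
import tempfile

# Hypothetical file with one JSON object per line.
tmp_dir = tempfile.mkdtemp()
jsonl_path = os.path.join(tmp_dir, "sample.jsonl")
with open(jsonl_path, "w", encoding="utf-8") as f:
    f.write('{"key1": "value1"}\n')
    f.write('{"key1": ["v11", "v12"], "key2": "v22"}\n')

records = []
with open(jsonl_path, encoding="utf-8") as f:
    for line in f:
        line = line.strip()
        if line:  # skip blank lines
            records.append(json.loads(line))

print(len(records))
print(records[1]["key1"])
```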
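This approach fails when a single record is spread over several lines. The cause is not json.loads() itself but readline(): it hands json.loads() one line at a time, and a single line is then not a complete JSON document.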
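The failure can be reproduced without a file. The sketch below uses a made-up pretty-printed JSON string: parsing only its first line raises json.JSONDecodeError, while parsing the complete text succeeds.

```python
import json

# Hypothetical pretty-printed JSON: one object spread over several lines.
pretty_text = '{\n  "key1": "value1",\n  "key2": ["v21", "v22"]\n}\n'

first_line = pretty_text.splitlines()[0]  # just the opening brace

try:
    json.loads(first_line)
    first_line_ok = True
except json.JSONDecodeError:
    # An opening brace alone is not a complete JSON document.
    first_line_ok = False

# Parsing the complete text works.
whole = json.loads(pretty_text)

print(first_line_ok)
print(whole["key2"])
```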
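2. The json.load() method
This method takes the open file object, reads all of its contents, and returns the parsed result: a dict when the top-level JSON value is an object.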
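A minimal sketch of json.load() on a multi-line file follows. The file is written to a temporary directory first so the example runs on its own; its name and contents are made up for illustration.

```python
import json
import os
import tempfile

# Write a hypothetical multi-line (indented) JSON file.
tmp_dir = tempfile.mkdtemp()
json_path = os.path.join(tmp_dir, "sample.json")
data_out = {"key1": ["v11", "v12", "v13"], "key2": "v22"}
with open(json_path, "w", encoding="utf-8") as f:
    json.dump(data_out, f, indent=2)

# json.load() takes the file object itself, not a string.
with open(json_path, encoding="utf-8") as f:
    data = json.load(f)

print(type(data))
print(data["key1"])
```

Because json.load() consumes the whole file, it does not matter how the object is split across lines.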
III. Reading txt files
1. Reading the whole file at once
with open("test.txt", "r") as f:
    data = f.read()  # the entire file as a single string
    print(data)
2. Reading a single line (often used together with a loop)
with open("test.txt", "r") as f:
    data = f.readline()  # the first line, including its trailing newline
    print(data)
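One way to combine readline() with a loop is to keep calling it until it returns an empty string, which signals the end of the file. The sketch below does this on a temporary file whose contents are made up for illustration.

```python
import os
import tempfile

# Create a small hypothetical text file.
tmp_dir = tempfile.mkdtemp()
txt_path = os.path.join(tmp_dir, "test.txt")
with open(txt_path, "w", encoding="utf-8") as f:
    f.write("first\nsecond\nthird\n")

lines_seen = []
with open(txt_path, "r", encoding="utf-8") as f:
    while True:
        line = f.readline()
        if line == "":  # an empty string means end of file
            break
        lines_seen.append(line.rstrip("\n"))

print(lines_seen)
```

A blank line inside the file is returned as "\n", not "", so it does not end the loop early.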
3. Reading all lines, returned as a list
with open("test.txt", "r") as f:
    for line in f.readlines():  # readlines() returns a list of str
        print(line)
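Each element returned by readlines() keeps its trailing newline, which is why printing the lines directly produces blank lines in between. A small sketch, again on a made-up temporary file, that shows the raw list and a stripped version:

```python
import os
import tempfile

# Create a small hypothetical text file.
tmp_dir = tempfile.mkdtemp()
txt_path = os.path.join(tmp_dir, "test.txt")
with open(txt_path, "w", encoding="utf-8") as f:
    f.write("first\nsecond\nthird\n")

with open(txt_path, "r", encoding="utf-8") as f:
    raw_lines = f.readlines()  # list of str, newlines included

# Remove the trailing newline from each element.
stripped = [line.rstrip("\n") for line in raw_lines]

print(raw_lines)
print(stripped)
```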