Storing unstructured files with GridFS and BSON (Python)
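© Copyright belongs to the author: original work by 51CTO blog author wx63899b601ff16. Contact the author for reprint permission; unauthorized reprints may be subject to legal action.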
Contents
- 1.1 GridFS
- 1.2 BSON upload (<16 MB)
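1 Storing large files with GridFS and BSON
GridFS is used to store and retrieve files larger than 16 MB (the size limit of a BSON document), such as images, audio, and video. It suits large files that rarely change but are often read sequentially. With PyMongo, the gridfs package can be used to build a large-file storage system.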
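To make the 16 MB threshold concrete, here is a minimal sketch of the chunking idea behind GridFS, which stores a file as many small pieces rather than one document. It runs in plain Python and does not talk to MongoDB. The names `split_into_chunks`, `needs_gridfs`, `CHUNK_SIZE`, and `BSON_LIMIT` are illustrative, and 255 KiB is assumed because it is the default GridFS chunk size.

```python
# Illustrative only: mimics GridFS-style chunking without a database.
CHUNK_SIZE = 255 * 1024          # assumed: default GridFS chunk size (255 KiB)
BSON_LIMIT = 16 * 1024 * 1024    # maximum size of a single BSON document


def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> list:
    """Return ordered pieces shaped like {"n": index, "data": bytes}."""
    return [
        {"n": i // chunk_size, "data": data[i:i + chunk_size]}
        for i in range(0, len(data), chunk_size)
    ]


def needs_gridfs(size_in_bytes: int) -> bool:
    """True when the payload alone already reaches the BSON document limit."""
    return size_in_bytes >= BSON_LIMIT
```

For a 600 KiB payload this yields three pieces (255 KiB, 255 KiB, and 90 KiB). Note that a real document also carries field names and other overhead, so a payload slightly under 16 MB may still be too large for a single document.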
1.1 GridFS
import os
from datetime import datetime

from bson import binary
from gridfs import GridFS
from pymongo import MongoClient
# Connect to the database
client = MongoClient("localhost:27017", username="kg", password="123", authSource="kgraph")
# Folders to upload from and download to
up_path = "/Users/enjlife/Desktop/cv/"
do_path = "/Users/enjlife/Desktop/temp/"
# Get the database and a GridFS handle backed by the 'pdfs' collections
db = client.kgraph
mycol = db.list_collection_names()  # list the collections
fs = GridFS(db, collection='pdfs')
# Upload and download functions
# Batch upload with GridFS
def gd_upload(fs, up_path):
    for filename in os.listdir(up_path):
        dic = {}
        dic["fname"] = filename
        # dic["upload_time"] = datetime.now()
        # Close the file once its bytes have been read
        with open(up_path + filename, "rb") as f:
            content = f.read()
        fs.put(content, **dic)
def gd_download(fs, do_path):
    for cursor in fs.find():
        name = cursor.fname
        content = cursor.read()
        with open(do_path + name, "wb") as f:
            f.write(content)
mycol = db.list_collection_names()  # list the collections; GridFS stores each upload in two parts
# ['pdfs', 'pdfs.files', 'pdfs.chunks']
# View the results
col = db.pdfs.files
for x in col.find():
    print(x)
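The upload and download functions in this section read each file fully into memory. As a hedged alternative, the sketch below streams instead: `fs.put` accepts a file-like object, and the objects returned by `fs.find()` can be read in pieces. The helper names `gd_upload_stream` and `gd_download_stream` are hypothetical, and `fs` is assumed to be a `GridFS` instance like the one created in this section.

```python
import os
import shutil


def gd_upload_stream(fs, up_path):
    """Upload every regular file in up_path by handing open files to fs.put."""
    ids = []
    for filename in sorted(os.listdir(up_path)):
        full_path = os.path.join(up_path, filename)
        if not os.path.isfile(full_path):
            continue  # skip sub-directories
        with open(full_path, "rb") as f:
            # fs.put reads from the file object; fname is stored as metadata
            ids.append(fs.put(f, fname=filename))
    return ids


def gd_download_stream(fs, do_path):
    """Copy every stored file into do_path without loading it whole."""
    os.makedirs(do_path, exist_ok=True)
    for grid_out in fs.find():
        target = os.path.join(do_path, grid_out.fname)
        with open(target, "wb") as f:
            # copyfileobj reads from grid_out in fixed-size blocks
            shutil.copyfileobj(grid_out, f)
```

Using `os.path.join` also removes the requirement that the folder strings end with a slash.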
1.2 BSON upload (<16 MB)
This continues from the database connection above; col now refers to the pdfs collection.
col = db.pdfs
def bs_upload(col, up_path):
    dic = {}
    dic["fname"] = "FastR-CNN.pdf"
    with open(up_path + "FastR-CNN.pdf", "rb") as f:
        data = f.read()
    # Insert only if no document with this name exists yet
    if not col.find_one({"fname": "FastR-CNN.pdf"}):
        dic["bdata"] = binary.Binary(data)
        col.insert_one(dic)  # insert() was removed in PyMongo 4
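This section covers only the upload half. A matching download could look like the sketch below, which looks a document up by `fname` and writes its `bdata` field back to disk. `bs_download` is a hypothetical helper, not part of PyMongo, and it assumes documents shaped like the ones inserted by `bs_upload`.

```python
import os


def bs_download(col, fname, do_path):
    """Write the bdata field of the document named fname into do_path.

    Returns the output path, or None when no such document exists.
    """
    doc = col.find_one({"fname": fname})
    if doc is None:
        return None
    os.makedirs(do_path, exist_ok=True)
    target = os.path.join(do_path, fname)
    with open(target, "wb") as f:
        # Binary fields come back as bytes (or a bytes subclass),
        # so they can be written to the file directly
        f.write(doc["bdata"])
    return target
```

A call such as `bs_download(col, "FastR-CNN.pdf", do_path)` would then recreate the PDF in the download folder.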