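Wrapping an output stream to upload large Excel files to COS in parts

  • BufferedMultipleOutputStream: consuming the stream in stages
  • Background
  • I. Approach 1 (failed)
  • II. SXSSF + splitting the stream into parts (not recommended)
  • III. Approach 3: SXSSF + buffered stream + COS multipart upload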


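BufferedMultipleOutputStream: consuming the stream in stages

Background

In B2B (toB) development, generating, uploading and downloading Excel files is unavoidable. A typical export works like this:

  1. The Excel file is generated with POI, or with a wrapper around it.
  2. The data is queried page by page.
  3. Each page is written into the workbook sheet.
  4. The workbook is written to an in-memory OutputStream, which is then converted to an InputStream.
  5. The InputStream is uploaded to COS.
    This works for small data sets, but once the data exceeds about ten thousand rows an OOM is inevitable. The machines running the service are underpowered and will not be upgraded, so the problem is how to export the file safely and efficiently while using as little memory as possible.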

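I. Approach 1 (failed)

  1. Query the data page by page and write it to the POI workbook in batches. This part is not hard.
  2. The first idea was to write to the stream in batches, flushing after each batch. Once the stream held more than 1 MB (COS requires every part of a multipart upload to be at least 1 MB), it would be uploaded as a part; then more data would be read, written to the workbook and flushed again, until all data had been queried.
    Problem: Hutool threw an error on the batched flush because the stream was already closed. Reading the source showed that POI closes the stream on flush, and none of the workarounds helped. I also tried SXSSF to get around this, but that had problems too. The code is not included here because all of it was wrong.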

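II. SXSSF + splitting the stream into parts (not recommended)

  1. Continuing from the steps above, query the data page by page and write it to the workbook. This time use SXSSF, which spools rows to temporary files on disk and so uses little memory (this mode is somewhat unfriendly to Excel, but that does not matter for a pure export).
  2. Write the whole workbook to the OutputStream in one go.
  3. Split that stream into several streams of equal size.
  4. Upload each stream as one part as soon as it is created. That takes care of the multipart upload.
    Problem: this solves the part splitting and avoids converting one huge OutputStream into one huge InputStream, but the drawback is that the OutputStream must still hold the whole file, and that stream can be very large.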
// bytes is assumed to hold the whole workbook, already written to memory; _M is the part size (1 MB)
InitiateMultipartUploadResult initiateMultipartUploadResult = tencentCOSService.initiateMultipartUpload(key);
logger.info("{} finished multipart upload step {}", key, "initiateMultipartUpload");
List<PartETag> partETagList = Lists.newArrayList();
int partNumber = 1;
for (int offset = 0; offset < bytes.length; offset += _M) {
    // The last part may be shorter than _M
    int partSize = Math.min(_M, bytes.length - offset);
    ByteArrayInputStream byteArrayInputStream = new ByteArrayInputStream(bytes, offset, partSize);
    UploadPartResult uploadPartResult = tencentCOSService.uploadPart(key, initiateMultipartUploadResult.getUploadId(), partNumber, partSize, byteArrayInputStream);
    partETagList.add(uploadPartResult.getPartETag());
    logger.info("{} finished multipart upload step {}, partNumber={}", key, "uploadPart", partNumber);
    partNumber++;
}
CompleteMultipartUploadResult completeMultipartUploadResult = tencentCOSService.completeMultipartUpload(key, initiateMultipartUploadResult.getUploadId(), partETagList);
logger.info("{} finished multipart upload step {}", key, "completeMultipartUpload");
url = tencentCOSService.getFileUrl(completeMultipartUploadResult);
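
The splitting step on its own can be sketched without any COS dependency. The helper below is a hypothetical illustration (PartRanges and split are names made up for this example, not part of the original project): it only computes the offset and length of each part, so that the last part is shorter instead of empty or out of range.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: computes the {offset, length} range of every part of a payload. */
final class PartRanges {
    private PartRanges() {}

    /** Returns one {offset, length} pair per part; only the last part may be shorter than partSize. */
    static List<int[]> split(int totalLength, int partSize) {
        if (totalLength < 0 || partSize <= 0) {
            throw new IllegalArgumentException("totalLength must be >= 0 and partSize must be > 0");
        }
        List<int[]> ranges = new ArrayList<>();
        int offset = 0;
        while (offset < totalLength) {
            int length = Math.min(partSize, totalLength - offset);
            ranges.add(new int[] {offset, length});
            offset += length; // never exceeds totalLength, so no int overflow
        }
        return ranges;
    }
}
```

Each pair can then be passed to new ByteArrayInputStream(bytes, offset, length) and uploaded with part number index + 1.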

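III. Approach 3: SXSSF + buffered stream + COS multipart upload

The fix is a custom stream: the place to intervene is the OutputStream passed to writer.flush(outputStream). After going back over the stream source code and my notes I found BufferedOutputStream, which was a lifesaver. It does not fit the problem exactly, so I subclassed it and create a new underlying stream whenever the buffer fills up.
Here is the code.

BufferedMultipleOutputStream
The idea is to use the buffer: whenever it fills up, a new stream is created and writing continues, while the old stream is handed off for processing. The buffer can therefore be set to 1024 * 1024 bytes, so each stream produced when the buffer fills can be uploaded to COS directly as one part.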

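import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class BufferedMultipleOutputStream extends BufferedOutputStream {

    private final IStreamHandler iStreamHandler;
    // Number of streams already handed to the handler
    private int pushCount = 0;
    // 1 MB buffer: every output stream holds this much; when one is full, a new one is created
    public static final int bufferSize = 1024 * 1024;

    public BufferedMultipleOutputStream(OutputStream out, IStreamHandler iStreamHandler) {
        super(out, bufferSize);
        this.iStreamHandler = iStreamHandler;
    }

    // Hands the current stream to the handler and returns a fresh instance of the same class.
    // That class needs an accessible no-arg constructor (ByteArrayOutputStream has one).
    private OutputStream nextOutputStream() throws IOException {
        out.flush();
        Class<? extends OutputStream> class_ = out.getClass();
        iStreamHandler.handler(out, pushCount);
        pushCount++;
        try {
            return class_.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            // Fail here instead of returning null and hitting a NullPointerException on the next write
            throw new IOException("Cannot create a new " + class_.getName(), e);
        }
    }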

    /**
     * Flush the internal buffer
     */
    private void flushBuffer() throws IOException {
        if (count > 0) {
            out.write(buf, 0, count);
            count = 0;
        }
    }



    public synchronized void write(int b) throws IOException {
        if (count >= buf.length) {
            flushBuffer();
            out = nextOutputStream();
        }
        buf[count++] = (byte) b;
    }



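    @Override
    public synchronized void write(byte b[], int off, int len) throws IOException {
        int finalOffset = 0; // bytes of this call that have already been copied
        while (len > finalOffset) {
            // Copy what is left of the input, capped by the free space in the buffer
            int offset = Math.min(len - finalOffset, buf.length - count);
            System.arraycopy(b, off, buf, count, offset);
            finalOffset = finalOffset + offset;
            off = off + offset;
            count = count + offset;
            if (count == buf.length) {
                // Buffer is full: push it into the current stream and switch to a new one
                flushBuffer();
                out = nextOutputStream();
            }
        }
    }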


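    @Override
    public synchronized void flush() throws IOException {
        // Intentionally does nothing: callers flush many times, and flushing early would cut a part short

    }

    // Because flush() is called many times, the caller must call lastFlush() explicitly once all data is written
    public synchronized void lastFlush() throws IOException {
        if (count == 0 && pushCount > 0) {
            // The data ended exactly on a part boundary, so the current stream is empty: nothing to upload
            return;
        }
        // Final step: write what is left in the buffer and hand over the last stream
        flushBuffer();
        out.flush();
        iStreamHandler.handler(out, pushCount);
    }
}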
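
The buffer rotation can also be tried out without POI, COS or reflection. The following is a minimal sketch of a variation on the class above, not the author's implementation: it extends OutputStream directly and passes each full buffer to a callback as a byte array. ChunkingOutputStream, PartSink and finish() are hypothetical names introduced only for this example.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.util.Arrays;

/** Hypothetical callback that receives each completed part; part numbers start at 1. */
interface PartSink {
    void accept(int partNumber, byte[] part) throws IOException;
}

/** Sketch: collects writes into fixed-size parts and hands each full part to a sink. */
final class ChunkingOutputStream extends OutputStream {
    private final byte[] buf;
    private final PartSink sink;
    private int count;          // bytes currently buffered, always < buf.length between calls
    private int partNumber = 1;

    ChunkingOutputStream(int partSize, PartSink sink) {
        if (partSize <= 0) {
            throw new IllegalArgumentException("partSize must be > 0");
        }
        this.buf = new byte[partSize];
        this.sink = sink;
    }

    @Override
    public void write(int b) throws IOException {
        buf[count++] = (byte) b;
        if (count == buf.length) {
            emit();
        }
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        if (off < 0 || len < 0 || len > b.length - off) {
            throw new IndexOutOfBoundsException();
        }
        while (len > 0) {
            int n = Math.min(len, buf.length - count);
            System.arraycopy(b, off, buf, count, n);
            count += n;
            off += n;
            len -= n;
            if (count == buf.length) {
                emit();
            }
        }
    }

    /** Deliberately a no-op so that intermediate flushes do not produce short parts. */
    @Override
    public void flush() {
    }

    /** Emits the final, possibly short, part. Call once after the last write. */
    void finish() throws IOException {
        if (count > 0) {
            emit();
        }
    }

    private void emit() throws IOException {
        sink.accept(partNumber++, Arrays.copyOf(buf, count));
        count = 0;
    }
}
```

With a part size of 4, writing 10 bytes and then calling finish() hands the sink parts of 4, 4 and 2 bytes. The trade-off against the reflection-based version is one extra copy per part, in exchange for not depending on the class of the wrapped stream.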

IStreamHandler

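import java.io.OutputStream;

public interface IStreamHandler {
    // Called once per filled stream; count is the zero-based index of that stream
    void handler(OutputStream outputStream, int count);
    // Completes the multipart upload after the last stream has been handled
    void submitCos();

    // URL of the uploaded file, available after submitCos()
    String getFileUlr();
}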
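
Because the stream only talks to this interface, it can be exercised without COS. The sketch below is a hypothetical test double (RecordingStreamHandler and the memory:// URL are made up for illustration) that records the size of each stream instead of uploading it. The interface is repeated only so that the block compiles on its own, and the handler assumes every stream it receives is a ByteArrayOutputStream.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;

// Copy of the interface above, repeated only so this sketch is self-contained.
interface IStreamHandler {
    void handler(OutputStream outputStream, int count);
    void submitCos();
    String getFileUlr();
}

/** Hypothetical test double: records the size of every part instead of uploading it. */
final class RecordingStreamHandler implements IStreamHandler {
    private final List<Integer> partSizes = new ArrayList<>();
    private boolean submitted;

    @Override
    public void handler(OutputStream outputStream, int count) {
        // Assumption: the wrapped stream is always a ByteArrayOutputStream
        if (!(outputStream instanceof ByteArrayOutputStream)) {
            throw new IllegalArgumentException("expected a ByteArrayOutputStream");
        }
        // Parts must arrive in order, starting at index 0
        if (count != partSizes.size()) {
            throw new IllegalStateException("unexpected part index " + count);
        }
        partSizes.add(((ByteArrayOutputStream) outputStream).size());
    }

    @Override
    public void submitCos() {
        submitted = true;
    }

    @Override
    public String getFileUlr() {
        // Placeholder URL; there is no real upload behind it
        return submitted ? "memory://parts/" + partSizes.size() : null;
    }

    List<Integer> partSizes() {
        return partSizes;
    }
}
```

Passing such a handler to BufferedMultipleOutputStream makes it possible to check part boundaries in a unit test before any real upload is involved.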

CosIStreamHandler

public class CosIStreamHandler implements IStreamHandler {

    private final Logger logger = LoggerFactory.getLogger(CosIStreamHandler.class);

    private String key;
    private ITencentCOSService tencentCOSService;
    private InitiateMultipartUploadResult initiateMultipartUploadResult;
    private List<PartETag> partETags = Lists.newArrayList();
    private String uploadId = null;
    private String fileUrl;

    public CosIStreamHandler(String key, ITencentCOSService tencentCOSService) {
        this.key = key;
        this.tencentCOSService = tencentCOSService;
    }

    @Override
    public void handler(OutputStream outputStream, int count) {
        InputStream inputStream = StreamUtils.parse(outputStream);
        int available = 0;
        try {
            available = inputStream.available();
        } catch (IOException e) {
            logger.warn("Could not read the part size, key={}", key, e);
        }
        try {
            if (count == 0) {
                // The first part also starts the multipart upload
                logger.debug("initiateMultipartUpload to COS started, key={}", key);
                initiateMultipartUploadResult = tencentCOSService.initiateMultipartUpload(key);
                uploadId = initiateMultipartUploadResult.getUploadId();
                logger.debug("initiateMultipartUpload to COS finished, uploadId={},key={}", uploadId, key);
            }

            if (StringUtils.isBlank(uploadId)) {
                return;
            }
            // count is zero-based, part numbers start at 1
            int partNumber = count + 1;
            logger.debug("COS part upload started, key={},partNumber={},inputStream.available()={},uploadId={}", key, partNumber, available, uploadId);
            UploadPartResult uploadPartResult = tencentCOSService.uploadPart(key, uploadId, partNumber, available, inputStream);
            logger.debug("COS part upload finished, key={},partNumber={},inputStream.available()={},uploadId={},UploadPartResult={}", key, partNumber, available, uploadId, JSON.toJSONString(uploadPartResult));

            partETags.add(new PartETag(uploadPartResult.getPartNumber(), uploadPartResult.getETag()));
        } catch (Exception e) {
            logger.error("COS part upload failed, key={}", key, e);
            // Only abort if the multipart upload was actually started
            if (!StringUtils.isBlank(uploadId)) {
                tencentCOSService.abortMultipartUpload(key, uploadId);
            }
            throw new BusinessErrorException("COS part upload failed, key=" + key);
        } finally {
            IoUtil.close(outputStream);
            IoUtil.close(inputStream);
        }
    }


    // The handler cannot tell which part is the last one, so the upload has to be completed explicitly
    @Override
    public void submitCos() {
        logger.debug("completeMultipartUpload to COS started, key={}", key);
        CompleteMultipartUploadResult completeMultipartUploadResult = tencentCOSService.completeMultipartUpload(key, uploadId, partETags);
        fileUrl = tencentCOSService.getFileUrl(completeMultipartUploadResult);
        logger.debug("completeMultipartUpload to COS finished, key={},result ={},fileUrl={}", key, JSON.toJSONString(completeMultipartUploadResult), fileUrl);

    }

    @Override
    public String getFileUlr() {
        return fileUrl;
    }
}
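
StreamUtils.parse is not shown in this post. A minimal sketch of what such a conversion could look like, assuming the stream handed to the handler is always a ByteArrayOutputStream (StreamConversions and toInputStream are hypothetical names, not the project's actual utility):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.io.OutputStream;

/** Hypothetical stand-in for the StreamUtils.parse call used by the handler. */
final class StreamConversions {
    private StreamConversions() {}

    /** Turns the bytes collected in an in-memory output stream into a readable stream. */
    static InputStream toInputStream(OutputStream out) {
        // Assumption: only in-memory streams are passed in
        if (!(out instanceof ByteArrayOutputStream)) {
            throw new IllegalArgumentException("only ByteArrayOutputStream is supported");
        }
        // toByteArray() copies the data, so one part briefly exists twice in memory
        byte[] data = ((ByteArrayOutputStream) out).toByteArray();
        return new ByteArrayInputStream(data);
    }
}
```

With 1 MB parts the extra copy costs about 1 MB per upload rather than the size of the whole file. For a ByteArrayInputStream, available() returns the exact number of remaining bytes, which is why the handler can use it as the part length; that does not hold for arbitrary InputStream implementations.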

Usage (pseudocode)

// writer is the Hutool Excel writer created earlier
OutputStream outputStream = new ByteArrayOutputStream();

IStreamHandler iStreamHandler = new CosIStreamHandler(key, cosServierBuilder.routeService(context));
BufferedMultipleOutputStream bufferedMultipleOutputStream = new BufferedMultipleOutputStream(outputStream, iStreamHandler);
try {

    // Query the data page by page and append each page to the writer
    for (int i = 0; i < pageTotal; i++) {
        HutoolUtil.listToExcleBigDataOneStream(writer, handlerDataExportService, headerDTO);
    }
    writer.flush(bufferedMultipleOutputStream);
    // Finish the COS upload: push the last part, then complete the multipart upload
    bufferedMultipleOutputStream.lastFlush();
    iStreamHandler.submitCos();
} catch (Exception e) {
    logger.error("COS multipart upload task failed", e);
    throw new BusinessErrorException("COS multipart upload task failed");
} finally {
    IoUtil.close(bufferedMultipleOutputStream);
    IoUtil.close(writer);
}

https://github.com/dawuti/wuti-common-code
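This does not solve the root problem, but of the three approaches this combination (SXSSF + wrapped stream + COS multipart upload) uses the least memory.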