[Repost] goofys_Azure

 Goofys is a high-performance, POSIX-ish Amazon S3 file system written in Go

   

Overview

Goofys allows you to mount an S3 bucket as a filey system.

It's a Filey System instead of a File System because goofys strives for performance first and POSIX second. In particular, things that are difficult to support on S3, or that would translate into more than one round-trip, either fail (random writes) or are faked (no per-file permissions). Goofys does not have an on-disk data cache (check out catfs), and its consistency model is close-to-open.

Installation

  • On Linux, install via pre-built binaries. You may also need to install fuse if you want to mount it on startup.
  • On macOS, install via Homebrew:
$ brew cask install osxfuse
$ brew install goofys
  • Or build from source with Go 1.10 or later:
$ export GOPATH=$HOME/work
$ go get github.com/kahing/goofys
$ go install github.com/kahing/goofys

Usage

$ cat ~/.aws/credentials
[default]
aws_access_key_id = AKID1234567890
aws_secret_access_key = MY-SECRET-KEY
$ $GOPATH/bin/goofys <bucket> <mountpoint>
$ $GOPATH/bin/goofys <bucket:prefix> <mountpoint> # if you only want to mount objects under a prefix

Users can also configure credentials via the AWS CLI or the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables.
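
As a minimal sketch of the environment-variable route, the helper below refuses to mount unless both variables are set. The function name mount_with_env_credentials and the GOOFYS_BIN override are illustrative assumptions, not part of goofys; the binary path defaults to the $GOPATH/bin/goofys location used above, with $HOME/work as the fallback GOPATH.

```shell
# Hypothetical helper: mount a bucket using credentials taken from the environment.
mount_with_env_credentials() {
    bucket="$1"
    mountpoint="$2"
    # Assumption: goofys lives where the build-from-source steps put it,
    # unless GOOFYS_BIN (a made-up override) points somewhere else.
    goofys_bin="${GOOFYS_BIN:-${GOPATH:-$HOME/work}/bin/goofys}"

    # Stop early if either credential variable is missing or empty.
    if [ -z "${AWS_ACCESS_KEY_ID:-}" ] || [ -z "${AWS_SECRET_ACCESS_KEY:-}" ]; then
        echo "AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY must both be set" >&2
        return 1
    fi

    # Create the mount point if needed, then mount.
    mkdir -p "$mountpoint" || return 1
    "$goofys_bin" "$bucket" "$mountpoint"
}
```

After exporting both variables, a call would look like mount_with_env_credentials <bucket> <mountpoint>.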

To mount an S3 bucket on startup, make sure the credentials are configured for root, then add this to /etc/fstab:

goofys#bucket   /mnt/mountpoint        fuse     _netdev,allow_other,--file-mode=0666,--dir-mode=0777    0       0
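
When several buckets need entries, generating the line can help avoid typos in the option list. The sketch below prints an entry in the same six-field layout as the example above; goofys_fstab_entry is a hypothetical helper name, and the bucket and mount point in the sample call are placeholders.

```shell
# Hypothetical helper: print an /etc/fstab entry in the format shown above.
goofys_fstab_entry() {
    bucket="$1"
    mountpoint="$2"
    # Mount options copied from the example entry; adjust the modes as needed.
    opts="_netdev,allow_other,--file-mode=0666,--dir-mode=0777"
    # Fields: device, mount point, type, options, dump, pass (tab-separated).
    printf 'goofys#%s\t%s\tfuse\t%s\t0\t0\n' "$bucket" "$mountpoint" "$opts"
}

# Print an entry for a placeholder bucket; review it before adding it to /etc/fstab.
goofys_fstab_entry mybucket /mnt/data
```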

See also: Instructions for Azure Blob Storage, Azure Data Lake Gen1, and Azure Data Lake Gen2.

Got more questions? Check out the questions other people have asked.

Benchmark

Using --stat-cache-ttl 1s --type-cache-ttl 1s for goofys and -ostat_cache_expire=1 for s3fs to simulate cold runs. Details of the benchmark can be found in bench.sh. Raw data is available as well. The test was run on an EC2 m5.4xlarge in us-west-2a connected to a bucket in us-west-2. Units are seconds.
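
For reference, the cold-run settings named above could be applied to manual mounts roughly as follows. This is a sketch, not an excerpt from bench.sh: the function names and the GOOFYS_BIN and S3FS_BIN overrides are assumptions for illustration, and both tools are assumed to be on PATH.

```shell
# Hypothetical helpers: mount with one-second metadata cache lifetimes,
# so that repeated runs behave like cold runs.
mount_cold_goofys() {
    # $1 = bucket, $2 = mount point
    "${GOOFYS_BIN:-goofys}" --stat-cache-ttl 1s --type-cache-ttl 1s "$1" "$2"
}

mount_cold_s3fs() {
    # $1 = bucket, $2 = mount point
    "${S3FS_BIN:-s3fs}" "$1" "$2" -ostat_cache_expire=1
}
```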

To run the benchmark, configure EC2's instance role to be able to write to $TESTBUCKET, and then do:

$ sudo docker run -e BUCKET=$TESTBUCKET -e CACHE=false --rm --privileged --net=host -v /tmp/cache:/tmp/cache kahing/goofys-bench
# result will be written to $TESTBUCKET

See also: cached benchmark result and result on Azure.

License

Copyright (C) 2015 - 2019 Ka-Hing Cheung

Licensed under the Apache License, Version 2.0

Current Status

goofys has been tested under Linux and macOS.

List of non-POSIX behaviors/limitations:

  • only sequential writes supported
  • does not store file mode/owner/group (use the --(dir|file)-mode or --(uid|gid) options instead)
  • does not support symlink or hardlink
  • ctime and atime are always the same as mtime
  • cannot rename directories with more than 1000 children
  • unlink returns success even if file is not present
  • fsync is ignored, files are only flushed on close
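
Two of these limitations can be observed with a small probe run against a directory. The sketch below is an illustration only: probe_posix_gaps is a hypothetical name, and the expectation that both checks report "not supported" on a goofys mount is taken from the list above rather than verified here. On an ordinary local file system both checks should report "supported".

```shell
# Hypothetical probe: check symlink and random-write support in a directory.
probe_posix_gaps() {
    dir="$1"
    target="$dir/probe.$$"
    # Create a small file with a plain sequential write.
    printf 'hello' > "$target" || return 1

    # Symlinks are listed as unsupported on goofys.
    if ln -s "$target" "$target.link" 2>/dev/null; then
        echo "symlink: supported"
        rm -f "$target.link"
    else
        echo "symlink: not supported"
    fi

    # Random write: overwrite one byte at offset 1 without truncating the file.
    if printf 'E' | dd of="$target" bs=1 seek=1 conv=notrunc 2>/dev/null; then
        echo "random write: supported"
    else
        echo "random write: not supported"
    fi

    rm -f "$target"
}
```

Calling probe_posix_gaps <mountpoint> on a mounted bucket and again on a local directory makes the difference visible side by side.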

Compatibility with non-AWS S3

goofys has been tested with the following non-AWS S3 providers:

  • Amplidata / WD ActiveScale
  • Ceph (e.g. Digital Ocean Spaces, DreamObjects, gridscale)
  • EdgeFS
  • EMC Atmos
  • Google Cloud Storage
  • Minio (limited)
  • OpenStack Swift
  • S3Proxy
  • Scaleway
  • Wasabi

Additionally, goofys works with the following non-S3 object stores:

  • Azure Blob Storage
  • Azure Data Lake Gen1
  • Azure Data Lake Gen2

References

  • catfs: caching layer that can be used with goofys
  • s3fs: another popular filesystem for S3
  • gcsfuse: filesystem for Google Cloud Storage. Goofys borrowed some skeleton code from this project.