Goofys is a high-performance, POSIX-ish Amazon S3 file system written in Go
Overview
Goofys allows you to mount an S3 bucket as a filey system.
It's a Filey System instead of a File System because goofys strives for performance first and POSIX second. In particular, operations that are difficult to support on S3, or that would translate into more than one round trip, either fail (random writes) or are faked (no per-file permissions). Goofys does not have an on-disk data cache (check out catfs), and its consistency model is close-to-open.
Installation
- On Linux, install via pre-built binaries. You may also need to install fuse if you want to mount on startup.
- On macOS, install via Homebrew:
$ brew cask install osxfuse
$ brew install goofys
- Or build from source with Go 1.10 or later:
$ export GOPATH=$HOME/work
$ go get github.com/kahing/goofys
$ go install github.com/kahing/goofys
Usage
$ cat ~/.aws/credentials
[default]
aws_access_key_id = AKID1234567890
aws_secret_access_key = MY-SECRET-KEY
$ $GOPATH/bin/goofys <bucket> <mountpoint>
$ $GOPATH/bin/goofys <bucket:prefix> <mountpoint> # if you only want to mount objects under a prefix
Users can also configure credentials via the AWS CLI or the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables.
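As a rough illustration of the environment-variable route, the sketch below builds a goofys command from a Go program and passes the credentials through AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY. The helper name mountCommand, the bucket name, and the mountpoint are hypothetical, and the sketch assumes a goofys binary on PATH. It only constructs the command and does not run it.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// mountCommand builds (but does not run) a goofys invocation that receives
// its credentials through the two environment variables named above.
// The function name and its parameters are hypothetical, for illustration only.
func mountCommand(bucket, mountpoint, accessKeyID, secretKey string) *exec.Cmd {
	cmd := exec.Command("goofys", bucket, mountpoint)
	// Start from the current environment, then append the credentials.
	// When a key appears twice, os/exec uses the last value in the slice.
	cmd.Env = append(os.Environ(),
		"AWS_ACCESS_KEY_ID="+accessKeyID,
		"AWS_SECRET_ACCESS_KEY="+secretKey,
	)
	return cmd
}

func main() {
	// Placeholder values, matching the style of the credentials file above.
	cmd := mountCommand("my-bucket", "/mnt/my-bucket", "AKID1234567890", "MY-SECRET-KEY")
	fmt.Println(cmd.Args) // the command line that would be executed
	// Calling cmd.Run() here would attempt the actual mount.
}
```

Appending to os.Environ() keeps the rest of the environment (PATH, HOME) intact, so a credentials file under the home directory would still be reachable if the variables were left out.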
To mount an S3 bucket on startup, make sure the credentials are configured for root, then add this to /etc/fstab:
goofys#bucket /mnt/mountpoint fuse _netdev,allow_other,--file-mode=0666,--dir-mode=0777 0 0
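The fstab line has the usual six fields: the device (goofys#bucket), the mountpoint, the filesystem type (fuse), a comma-separated option list, and the dump and pass numbers. The sketch below formats such a line for a given bucket and mountpoint. The helper fstabEntry and its parameters are hypothetical and only reproduce the shape of the entry shown above. The helper is not part of goofys.

```go
package main

import "fmt"

// fstabEntry formats an /etc/fstab line in the same shape as the entry above.
// The function and its parameters are hypothetical helpers for illustration.
func fstabEntry(bucket, mountpoint string, fileMode, dirMode uint32) string {
	return fmt.Sprintf(
		// %04o prints the mode in octal, zero-padded to four digits.
		"goofys#%s %s fuse _netdev,allow_other,--file-mode=%04o,--dir-mode=%04o 0 0",
		bucket, mountpoint, fileMode, dirMode,
	)
}

func main() {
	// Same values as the example entry: bucket "bucket", modes 0666 and 0777.
	fmt.Println(fstabEntry("bucket", "/mnt/mountpoint", 0666, 0777))
}
```

Because goofys does not store per-file permissions, the --file-mode and --dir-mode values in the option list apply to every file and directory in the mount.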
See also: Instructions for Azure Blob Storage, Azure Data Lake Gen1, and Azure Data Lake Gen2.
Got more questions? Check out questions other people asked.
Benchmark
The benchmark uses --stat-cache-ttl 1s --type-cache-ttl 1s for goofys and -ostat_cache_expire=1 for s3fs to simulate cold runs. Details of the benchmark can be found in bench.sh. Raw data is available as well. The test was run on an EC2 m5.4xlarge in us-west-2a connected to a bucket in us-west-2. Units are seconds.
To run the benchmark, configure the EC2 instance role so that it can write to $TESTBUCKET, then run:
$ sudo docker run -e BUCKET=$TESTBUCKET -e CACHE=false --rm --privileged --net=host -v /tmp/cache:/tmp/cache kahing/goofys-bench
# result will be written to $TESTBUCKET
See also: cached benchmark result and result on Azure.
License
Copyright (C) 2015 - 2019 Ka-Hing Cheung
Licensed under the Apache License, Version 2.0
Current Status
goofys has been tested under Linux and macOS.
List of non-POSIX behaviors/limitations:
- only sequential writes supported
- does not store file mode/owner/group
  - use --(dir|file)-mode or --(uid|gid) options
- does not support symlink or hardlink
- ctime, atime is always the same as mtime
- cannot rename directories with more than 1000 children
- unlink returns success even if file is not present
- fsync is ignored, files are only flushed on close
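The write-related limitations above suggest a simple pattern for programs that write to a goofys mount: write each file once from start to finish, never seek, and treat the result of Close as the point where the write succeeds or fails. Below is a minimal Go sketch of that pattern. It runs against a temporary directory so it works anywhere. That upload errors would surface at Close is an assumption drawn from the "flushed on close" note, and the helper names are hypothetical.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
	"strings"
)

// writeSequential streams r into a new file at path from start to finish,
// without seeking, and reports any error returned by Close.
func writeSequential(path string, r io.Reader) (err error) {
	f, err := os.Create(path)
	if err != nil {
		return err
	}
	defer func() {
		// Files are only flushed on close, so a failure may first show up here.
		if cerr := f.Close(); err == nil {
			err = cerr
		}
	}()
	_, err = io.Copy(f, r)
	return err
}

// roundTrip writes data into a temporary directory and reads it back.
// Replace the temporary directory with a goofys mountpoint to try a bucket.
func roundTrip(data string) (string, error) {
	dir, err := os.MkdirTemp("", "seqwrite")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "object.txt")
	if err := writeSequential(path, strings.NewReader(data)); err != nil {
		return "", err
	}
	b, err := os.ReadFile(path)
	return string(b), err
}

func main() {
	got, err := roundTrip("hello from a sequential writer\n")
	if err != nil {
		fmt.Fprintln(os.Stderr, "write failed:", err)
		os.Exit(1)
	}
	fmt.Print(got)
}
```

Code that relies on f.Sync() for durability, or that rewrites part of an existing file in place, would need to be restructured around this write-once pattern before running on a goofys mount.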
Compatibility with non-AWS S3
goofys has been tested with the following non-AWS S3 providers:
- Amplidata / WD ActiveScale
- Ceph (e.g. DigitalOcean Spaces, DreamObjects, gridscale)
- EdgeFS
- EMC Atmos
- Google Cloud Storage
- Minio (limited)
- OpenStack Swift
- S3Proxy
- Scaleway
- Wasabi
Additionally, goofys works with the following non-S3 object stores:
- Azure Blob Storage
- Azure Data Lake Gen1
- Azure Data Lake Gen2
References
- Data is stored on Amazon S3
- Amazon SDK for Go
- Other related fuse filesystems
  - catfs: caching layer that can be used with goofys
  - s3fs: another popular filesystem for S3
  - gcsfuse: filesystem for Google Cloud Storage. Goofys borrowed some skeleton code from this project.
- S3Proxy is used for go test
- fuse binding, also used by gcsfuse