As a cloud-native distributed storage system, JuiceFS has had a pluggable architecture from the start, so that new technologies can be continuously integrated into its ecosystem. Users can choose the two core components, the data storage engine and the metadata engine, according to their needs.
The data storage engine of JuiceFS is mainly object storage, and supports almost all public and private cloud object storage services. It also supports KV storage, WebDAV, and local disks. The metadata engine supports databases such as Redis, MySQL, PostgreSQL, and SQLite.
The newly released JuiceFS v0.16 officially supports the TiKV key-value database, which helps meet the need for elastic scaling in high-performance, large-scale data storage.
This article shows how to use TiKV as the metadata engine for JuiceFS.
TiKV
TiKV is a distributed transactional key-value database that offers high scalability, low latency, and ease of use. It is designed to handle petabytes of data and trillions of rows.
By design, TiKV scales horizontally, provides a distributed transaction interface with ACID guarantees, and uses the Raft protocol to keep multiple replicas consistent and highly available.
TiKV was originally developed by PingCAP and is a project hosted by the Cloud Native Computing Foundation (CNCF).
Install TiKV
PingCAP provides the TiUP package manager, which makes it easy to install TiKV and related components on Linux or macOS.
1. Install TiUP
$ curl --proto '=https' --tlsv1.2 -sSf https://tiup-mirrors.pingcap.com/install.sh | sh
This command automatically detects the current system environment, downloads and installs the appropriate version, and adds the TiUP program path to the executable search path of your shell.
For the setting to take effect, open a new terminal or reload your shell profile manually. For example, with bash:
$ source ~/.bash_profile
Now run tiup -v. If you see version information similar to the following, the installation succeeded.
$ tiup -v
1.5.4 tiup
Go Version: go1.16.6
Git Ref: v1.5.4
GitHash: b629670276269cd1518eb28f362a5180135cc985
2. Deploy TiKV cluster
This article uses the playground component provided by TiUP to install a minimal TiKV cluster in the local environment for testing purposes.
$ tiup playground --mode tikv-slim
After the deployment is successful, the terminal will display a message similar to the following:
PD client endpoints: [127.0.0.1:2379]
To view the Prometheus: http://127.0.0.1:9090
To view the Grafana: http://127.0.0.1:3000
Here, 127.0.0.1:2379 is the address of the Placement Driver (PD), the management node of the TiKV cluster. JuiceFS interacts with TiKV through this address. The other two addresses are the Prometheus and Grafana services, which provide monitoring and data visualization for the TiKV cluster.
Note: The playground component of TiUP is mainly used to quickly build a minimal TiDB or TiKV test cluster in the local environment. For production deployments, refer to the official TiKV documentation.
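Before moving on, it can be handy to confirm that PD is actually answering on the address printed above. The following is a minimal sketch, not part of the original walkthrough. It assumes that PD exposes an HTTP health endpoint at /pd/api/v1/health and that curl is installed. The function names pd_health_url and wait_for_pd are made up for this illustration.

```shell
#!/bin/sh
# Sketch: wait until the PD endpoint printed by `tiup playground` answers.
# Assumption: PD serves a health endpoint at /pd/api/v1/health.

# Build the health-check URL for a PD address such as 127.0.0.1:2379.
pd_health_url() {
    printf 'http://%s/pd/api/v1/health\n' "$1"
}

# Poll the endpoint once per second, up to $2 attempts (default 30).
# Returns 0 as soon as PD answers, 1 if it never does.
wait_for_pd() {
    addr=$1
    attempts=${2:-30}
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if curl -fsS --max-time 2 "$(pd_health_url "$addr")" >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        sleep 1
    done
    return 1
}
```

For example, wait_for_pd 127.0.0.1:2379 && echo "PD is up" blocks for at most about 30 attempts before giving up.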
Install JuiceFS
JuiceFS supports Linux, Windows, and macOS. You only need to download the client for your platform and place it in the executable path of the system. For example, I am currently using a Linux distribution, so I can install the latest version of the client by running the following commands in sequence.
Query the latest release tag and store it in a temporary environment variable:
$ JFS_LATEST_TAG=$(curl -s https://api.github.com/repos/juicedata/juicefs/releases/latest | grep 'tag_name' | cut -d '"' -f 4 | tr -d 'v')
Download the latest JuiceFS client package for the current system:
$ wget "https://github.com/juicedata/juicefs/releases/download/v${JFS_LATEST_TAG}/juicefs-${JFS_LATEST_TAG}-linux-amd64.tar.gz"
Extract the package:
$ mkdir juice && tar -zxvf "juicefs-${JFS_LATEST_TAG}-linux-amd64.tar.gz" -C juice
Install the JuiceFS client to /usr/local/bin:
$ sudo install juice/juicefs /usr/local/bin
Run juicefs. If the help information is returned, the client is installed successfully.
$ juicefs
NAME:
juicefs - A POSIX file system built on Redis and object storage.
USAGE:
juicefs [global options] command [command options] [arguments...]
VERSION:
0.16.1 (2021-08-16 2edcfc0)
COMMANDS:
format format a volume
mount mount a volume
umount unmount a volume
gateway S3-compatible gateway
sync sync between two storage
rmr remove directories recursively
info show internal information for paths or inodes
bench run benchmark to read/write/stat big/small files
gc collect any leaked objects
fsck Check consistency of file system
profile analyze access log
stats show runtime stats
status show status of JuiceFS
warmup build cache for target directories/files
dump dump metadata into a JSON file
load load metadata from a previously dumped JSON file
help, h Shows a list of commands or help for one command
GLOBAL OPTIONS:
--verbose, --debug, -v enable debug log (default: false)
--quiet, -q only warning and errors (default: false)
--trace enable trace log (default: false)
--no-agent Disable pprof (:6060) and gops (:6070) agent (default: false)
--help, -h show help (default: false)
--version, -V print only the version (default: false)
COPYRIGHT:
AGPLv3
You can also visit the JuiceFS GitHub Releases page to choose another version and install it manually.
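If you script the installation, the tag lookup and the download URL can be wrapped in two small helpers. This is only a sketch: parse_latest_tag and juicefs_asset_url are hypothetical names, and the URL pattern for platforms other than linux-amd64 is an assumption extrapolated from the example above, so check the Releases page before relying on it.

```shell
#!/bin/sh
# Extract the version number (without the leading "v") from the
# GitHub "latest release" JSON supplied on stdin.
parse_latest_tag() {
    grep '"tag_name"' | cut -d '"' -f 4 | sed 's/^v//'
}

# Build the download URL for a given version, OS and architecture.
# Assumption: every asset follows the juicefs-<version>-<os>-<arch>.tar.gz
# pattern seen in the linux-amd64 example.
juicefs_asset_url() {
    printf 'https://github.com/juicedata/juicefs/releases/download/v%s/juicefs-%s-%s-%s.tar.gz\n' \
        "$1" "$1" "$2" "$3"
}
```

A typical call would be curl -s https://api.github.com/repos/juicedata/juicefs/releases/latest | parse_latest_tag, followed by juicefs_asset_url with the resulting version. Unlike tr -d 'v', the sed expression removes only a leading v.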
Usage
Following the JuiceFS Quick Start Guide, I set up a MinIO object storage instance locally. Its access address is http://127.0.0.1:9000, and the Access Key ID and Access Key Secret are both minioadmin.
1. Create a file system
The following command uses the format subcommand of the JuiceFS client to create a file system named mystor. The TiKV database address follows the format described in the official documentation, that is, it uses the PD address of the TiKV cluster:
$ juicefs format \
--storage minio \
--bucket http://127.0.0.1:9000/mystor \
--access-key minioadmin \
--secret-key minioadmin \
tikv://127.0.0.1:2379/mystor \
mystor
Parameter description:
- --storage: Specifies the data storage engine, here minio.
- --bucket: Specifies the bucket access URL. The bucket named mystor was created on MinIO in advance.
- --access-key and --secret-key: Specify the credentials for accessing the object storage service API.
- When using TiKV to store metadata, set the PD address of the cluster. When multiple file systems or applications share the same TiKV cluster, it is recommended to add an optional prefix. Here the prefix mystor is appended to the PD address.
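To make the address format concrete, here is a small sketch that assembles the metadata URL from a prefix and one or more PD addresses. The helper name tikv_meta_url is hypothetical, and the comma-separated form for several PD nodes is an assumption that is not shown in this article, so verify it against the JuiceFS documentation for your version.

```shell
#!/bin/sh
# Build a JuiceFS metadata URL of the form tikv://<pd1>[,<pd2>...]/<prefix>.
# $1 is the prefix; the remaining arguments are PD addresses (host:port).
# Assumption: multiple PD addresses are joined with commas.
tikv_meta_url() {
    prefix=$1
    shift
    addrs=
    for a in "$@"; do
        # Prepend a comma only when addrs already holds an address.
        addrs="${addrs:+$addrs,}$a"
    done
    printf 'tikv://%s/%s\n' "$addrs" "$prefix"
}
```

Calling tikv_meta_url mystor 127.0.0.1:2379 yields the address used in the format command above. Giving each file system its own prefix keeps their keys apart when they share one TiKV cluster.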
If you see output similar to the following, it means that the file system was created successfully:
2021/08/12 23:28:36.932241 juicefs[101222] <INFO>: Meta address: tikv://127.0.0.1:2379/mystor
[2021/08/12 23:28:36.932 +08:00] [INFO] [client.go:214] ["[pd] create pd client with endpoints"] [pd-address="[127.0.0.1:2379]"]
[2021/08/12 23:28:36.935 +08:00] [INFO] [base_client.go:346] ["[pd] switch leader"] [new-leader=http://127.0.0.1:2379] [old-leader=]
[2021/08/12 23:28:36.935 +08:00] [INFO] [base_client.go:126] ["[pd] init cluster id"] [cluster-id=6995548759432331426]
[2021/08/12 23:28:36.935 +08:00] [INFO] [client.go:238] ["[pd] create tso dispatcher"] [dc-location=global]
2021/08/12 23:28:36.936892 juicefs[101222] <INFO>: Data uses minio://127.0.0.1:9000/mystor/mystor/
2021/08/12 23:28:36.976722 juicefs[101222] <INFO>: Volume is formatted as {Name:mystor UUID:0c9594a8-fe2c-463c-a4b6-eb815f38c843 Storage:minio Bucket:http://127.0.0.1:9000/mystor AccessKey:minioadmin SecretKey:removed BlockSize:4096 Compression:none Shards:0 Partitions:0 Capacity:0 Inodes:0 EncryptKey:}
2. Mount the file system
Use the mount subcommand to mount the file system to the jfs directory under the current user's home directory:
$ sudo juicefs mount -d tikv://127.0.0.1:2379/mystor ~/jfs
The sudo command is used here to mount the file system as the superuser, so that JuiceFS can create and use the /var/jfsCache directory to cache data.
If you see output similar to the following, it means that the file system is mounted successfully:
2021/08/12 23:34:44.288136 juicefs[101873] <INFO>: Meta address: tikv://127.0.0.1:2379/mystor
[2021/08/12 23:34:44.288 +08:00] [INFO] [client.go:214] ["[pd] create pd client with endpoints"] [pd-address="[127.0.0.1:2379]"]
[2021/08/12 23:34:44.291 +08:00] [INFO] [base_client.go:346] ["[pd] switch leader"] [new-leader=http://127.0.0.1:2379] [old-leader=]
[2021/08/12 23:34:44.291 +08:00] [INFO] [base_client.go:126] ["[pd] init cluster id"] [cluster-id=6995548759432331426]
[2021/08/12 23:34:44.291 +08:00] [INFO] [client.go:238] ["[pd] create tso dispatcher"] [dc-location=global]
2021/08/12 23:34:44.296270 juicefs[101873] <INFO>: Data use minio://127.0.0.1:9000/mystor/mystor/
2021/08/12 23:34:44.296768 juicefs[101873] <INFO>: Disk cache (/var/jfsCache/0c9594a8-fe2c-463c-a4b6-eb815f38c843/): capacity (1024 MB), free ratio (10%), max pending pages (15)
2021/08/12 23:34:44.800551 juicefs[101873] <INFO>: OK, mystor is ready at /home/herald/jfs
Use the df command to check the mount status of the file system:
$ df -Th
Filesystem      Type          Size  Used Avail Use% Mounted on
JuiceFS:mystor  fuse.juicefs  1.0P   64K  1.0P   1% /home/herald/jfs
After mounting, you can store data in the ~/jfs directory just as you would on a local hard disk.
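A quick way to confirm that the mount point behaves like a normal directory is a write-and-read-back check. The sketch below is an illustration only. The function name smoke_test and the temporary file name are invented here, and the check works on any writable directory, so it proves that reads and writes round-trip, not that the data reached TiKV or MinIO.

```shell
#!/bin/sh
# Write a small file into the given directory, read it back,
# clean up, and succeed only if the content matches.
smoke_test() {
    dir=$1
    file="$dir/.jfs-smoke-$$"          # hypothetical scratch file name
    payload="hello from JuiceFS"
    printf '%s\n' "$payload" > "$file" || return 1
    read_back=$(cat "$file") || return 1
    rm -f "$file"
    [ "$read_back" = "$payload" ]
}
```

For example, smoke_test ~/jfs && echo "read/write OK". If the directory was created by root during the mount, you may need to adjust its permissions or run the check with sudo.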
3. View file system information
The status subcommand of the JuiceFS client shows the basic information and connection status of a file system.
$ juicefs status tikv://127.0.0.1:2379/mystor
{
"Setting": {
"Name": "mystor",
"UUID": "9f50f373-a7ec-4d5b-b790-3defbf6d0509",
"Storage": "minio",
"Bucket": "http://127.0.0.1:9000/mystor",
"AccessKey": "minioadmin",
"SecretKey": "removed",
"BlockSize": 4096,
"Compression": "none",
"Shards": 0,
"Partitions": 0,
"Capacity": 0,
"Inodes": 0
},
"Sessions": [
{
"Sid": 2,
"Heartbeat": "2021-08-13T10:43:35+08:00",
"Version": "0.16-dev (2021-08-12 a871c3d)",
"Hostname": "herald-manjaro",
"MountPoint": "/home/herald/jfs",
"ProcessID": 6309
}
]
}
The output shows the data storage engine used by the file system and the hosts that currently mount it.
In addition, with v0.16 and above you can view the detailed configuration of the file system by reading the .config virtual file in the root directory of the mount point:
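If you want to use the status output in a script, a couple of text filters are enough for simple checks. This is a rough sketch with hypothetical helper names (status_field, count_sessions). It assumes the pretty-printed layout shown above, with one key per line, and a proper JSON tool such as jq would be more robust.

```shell
#!/bin/sh
# Print the first string value stored under the given key,
# reading the `juicefs status` JSON from stdin.
# Assumption: the JSON is pretty-printed with one key per line.
status_field() {
    sed -n 's/.*"'"$1"'": *"\([^"]*\)".*/\1/p' | head -n 1
}

# Count the active sessions by counting "Sid" entries.
count_sessions() {
    grep -c '"Sid"'
}
```

For example, juicefs status tikv://127.0.0.1:2379/mystor | status_field Storage would print minio for the file system created above.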
$ sudo cat ~/jfs/.config
{
"Meta": {
"Strict": true,
"Retries": 10,
"CaseInsensi": false,
"ReadOnly": false,
"OpenCache": 0,
"MountPoint": "jfs",
"Subdir": ""
},
"Format": {
"Name": "myabc",
"UUID": "e9d8373c-7ced-49d9-a033-75f6abb44854",
"Storage": "minio",
"Bucket": "http://127.0.0.1:9000/mystor",
"AccessKey": "minioadmin",
"SecretKey": "removed",
"BlockSize": 4096,
"Compression": "none",
"Shards": 0,
"Partitions": 0,
"Capacity": 0,
"Inodes": 0
},
"Chunk": {
"CacheDir": "/var/jfsCache/e9d8373c-7ced-49d9-a033-75f6abb44854",
"CacheMode": 384,
"CacheSize": 1024,
"FreeSpace": 0.1,
"AutoCreate": true,
"Compress": "none",
"MaxUpload": 20,
"Writeback": false,
"Partitions": 0,
"BlockSize": 4194304,
"GetTimeout": 60000000000,
"PutTimeout": 60000000000,
"CacheFullBlock": true,
"BufferSize": 314572800,
"Readahead": 0,
"Prefetch": 1
},
"Version": "0.16.1 (2021-08-16 2edcfc0)",
"Mountpoint": "jfs"
}
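Several values in this output are raw numbers that are easier to read after conversion. The helpers below are a sketch with made-up names. They assume that CacheMode is a decimal file mode, that sizes in the Chunk section are in bytes, and that the timeouts are Go durations in nanoseconds. None of this is stated in the output itself, so treat the interpretation as an educated guess.

```shell
#!/bin/sh
# Decimal file mode -> octal, e.g. CacheMode 384 -> 600.
mode_to_octal() {
    printf '%o\n' "$1"
}

# Bytes -> MiB, e.g. BufferSize 314572800 -> 300.
bytes_to_mib() {
    echo $(( $1 / 1048576 ))
}

# Nanoseconds -> seconds, e.g. GetTimeout 60000000000 -> 60.
# Assumption: the value is a Go time.Duration serialized as an integer.
ns_to_seconds() {
    echo $(( $1 / 1000000000 ))
}
```

Under these assumptions, the cache files use mode 0600, the buffer is 300 MiB, each block is 4 MiB, and the get and put timeouts are 60 seconds each.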
Note: This article uses a local demonstration environment. To mount the same JuiceFS file system on multiple hosts, make sure that the object storage and the TiKV cluster are reachable from all of those hosts.
4. Unmount the file system
You can use the umount subcommand to unmount the file system, for example:
$ sudo juicefs umount ~/jfs
Warning: Forcibly unmounting a file system that is in use may cause data corruption or loss. Proceed with caution.
5. Mount at boot
If you don't want to remount JuiceFS manually after every reboot, you can set up automatic mounting.
First, copy the juicefs client to the /sbin/ directory under the name mount.juicefs:
$ sudo cp /usr/local/bin/juicefs /sbin/mount.juicefs
Edit the /etc/fstab configuration file and add a new record:
tikv://127.0.0.1:2379/mystor /home/herald/jfs juicefs _netdev,cache-size=20480 0 0
In the mount options, cache-size=20480 allocates 20 GiB of local disk space as the JuiceFS cache (the value is given in MiB). Choose the cache size according to your actual hardware. Generally speaking, the more cache space JuiceFS has, the better the performance.
You can adjust the FUSE mount options in the above configuration according to your needs.
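Because the cache size is easy to get wrong by a factor of 1024, a tiny generator for the fstab record can help. This is only a sketch: gib_to_mib and fstab_entry are hypothetical helpers, and the option list is limited to the two options used in the record above.

```shell
#!/bin/sh
# Convert GiB to MiB, the unit assumed for cache-size.
gib_to_mib() {
    echo $(( $1 * 1024 ))
}

# Print an /etc/fstab record for a JuiceFS mount.
# $1 = metadata URL, $2 = mount point, $3 = cache size in GiB.
fstab_entry() {
    printf '%s %s juicefs _netdev,cache-size=%s 0 0\n' \
        "$1" "$2" "$(gib_to_mib "$3")"
}
```

Running fstab_entry tikv://127.0.0.1:2379/mystor /home/herald/jfs 20 reproduces the record shown above. After adding it to /etc/fstab, sudo mount -a lets you test the entry without rebooting.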
Summary
For JuiceFS, support for TiKV is a milestone. It addresses the difficulty of scaling Redis horizontally as a metadata engine, makes up for the performance shortcomings of SQL databases such as MySQL and PostgreSQL, and gives users a new option when choosing a metadata engine.
Open Source Contribution Guide
JuiceFS is an open source project released under the AGPLv3, and its development depends on everyone's support. An article, a page of documentation, an idea, a suggestion, a report, or a bug fix: every contribution, big or small, helps drive the project forward.
Things you can do for the community:
- Star the project at https://github.com/juicedata/juicefs
- Post your opinions in the Forum
- Pick up development tasks in Issues
- Improve the JuiceFS documentation
- Share anything about JuiceFS on your blog, Twitter, vlog, or other social media
- Join the JuiceFS Slack channel
- Tell more people about JuiceFS and let them use it
We sincerely invite everyone who loves open source to join our community. Let's make JuiceFS better together!