How to Use S3-Compatible Minio with CloudBerry
Here at CloudBerry, one of my goals is to help partners decide on, pick and build the right configuration for handling their data when they need a backup solution. FTP, SCP, WebDAV and some proprietary protocols have been around forever, and their simplicity and flexibility made them the number one choice as primary target configurations in data protection and management. But there is a big BUT! Time flies, and now that we can afford tens or even hundreds of Mbps on bi-directional circuits, the target configuration (apart from disk IOPS) may become the primary bottleneck. I would consider the above protocols legacy, since they have a number of limitations and are slow due to the architecture of their data transfers. And this is where something else comes up. Minio is scoring more and more points as part of a seamless storage system for self-hosted configurations. A quick example of where you would follow this guide: you offer backup and DR services, and you have a bunch of unused disks (JBOD), a few NASes and a huge chunk of free space on one of your legendary legacy servers, which is still alive and which you can’t just throw away since it is still powerful and can do the job.
What is CloudBerry Lab #
Since 2011 CloudBerry Lab has offered backup and file management tools for IT pros and computer users. With two major product lines (Standalone and Managed Backup Service, “MBS” for short), the company offers tools for cross-platform (Windows, Linux and Mac) data protection (automated backup) and easy-to-use file management (CloudBerry Explorer), with a major focus on modern cloud storage providers and technologies. MBS is a SaaS designed to simplify the daily routine of IT departments and service providers (partners providing services to other companies) by bringing backup offerings to the next level. Along with ease of use, an affordable price and flexible agent options, MBS comes with a web UI and is available 24x7x365 as a system fully managed by the CloudBerry DevOps team. There are some other products as well, like Remote Assistance, CloudBerry Drive, native NAS (QNAP and Synology) applications, a data de-duplication server and G Suite / Office 365 backup tools (both based on native API methods from Google and Microsoft respectively).
What is Minio #
Minio is an object storage service that is fully compatible with the AWS S3 API. It supports AWS Signature v2 / v4 and comes with a variety of options for developers and architects, including SDKs, a CLI and a web UI, for accessing stored content. It fits development and testing purposes very well, and at the same time it is great for delivering static assets (like images, videos and documents) and for defining backup targets (repositories). Even with data protection functionality (against hardware failures, using erasure code and bitrot detection), highly available nodes (in distributed configurations) and some other nice features, it is still super quick and easy to set up and start using. In this article the main emphasis is on backup destination configurations, where we want Minio to act as our backup storage backend. Another awesome fact about Minio: it is open source (so it is a free product) and can be deployed through a really impressive list of options. There is Docker, a native app for NAS and compiled binaries for Linux, Mac or Windows, and it can be scaled with Kubernetes or deployed with Docker using popular cloud vendors as the docker-machine provider (e.g. Google Cloud Platform or Microsoft Azure). With that said, we found the combination of Minio stretched across the available disk facilities and the CloudBerry utilities to be a really handy and easy-to-use solution for backup offerings (public and private access are described below).
High level overview of our future configuration #
I am going to set up Minio fully secured with SSL and a firewall, pointing it at my available disk capacity (disk and NAS over NFS/SMB). I have a good old D-Link DNS-325 with two SATA disks (1 TiB and 1.5 TiB) on board, configured in RAID 0 in order to get better performance (since data is striped across both disks on IO operations). I think it is no longer on sale, but you shouldn’t worry about having the same model, since we are talking about the NAS in the context of its shares (they are similar across all devices). I could go ahead and configure JBOD or RAID 1, but my main argument is better read / write performance on the NAS. I am skeptical that my public backups (over the internet) can reach my disk IOPS and make the disks the primary bottleneck. I am sure that with a 100 Mbps network I won’t be seeing this in my local network backups either, even over wireless. Some of my benchmark tests are published in the tables below.
Build infrastructure for S3 #
I am going to use Debian 9 to run my Minio. Of course, I could do it directly on the NAS (I wish I could, but that is only possible in a very custom mode and I couldn’t make apt work to install some dependencies; moreover, my NAS is not 64-bit, so I couldn’t use the compiled binaries). Of course, it is cumbersome to find a place where the OS can be installed (especially if you do this at home), but since we are building a service provider setup, I assume we have at least one server. So we need a server. It does not matter whether it is a virtual machine on a large hypervisor node or a desktop product like Workstation or VirtualBox (the latter will work for testing purposes, but performance might differ because of the extra layers of virtualization).
Installing Linux #
Go ahead and grab the network installer for Debian 9 here. I used the amd64 netinst image, since it is small and lets me download what I need for my future system from the repositories. Since this is a bootable ISO, you know what to do next. I am going to skip the Debian installation itself. The only thing I want to highlight is that we don’t need anything except the Standard System and the SSH server (please go ahead and untick Desktop Environment, Print Server etc., unless you use or plan to use this server for something from that list, or you have a phobia of working in a dark CLI without a GUI :). We want to keep our server light and lean, designed only for data processing.
Install Debian and boot normally. Log in to the shell.
Getting dependencies and making first configurations #
Let’s go ahead and update the package lists and install the following dependencies:
sudo apt-get update
sudo apt-get install curl nfs-common net-utils net-tools dnsutils samba smbclient htop vim -y
Since we want to mount external shares on our Debian box, we need nfs-common and/or smbclient with samba. The other items are there for a better CLI experience (a text editor and network tools). Once they are installed, we can go ahead and download the Minio binary for our Linux environment and make it executable (this is done in the SSL cert and default configuration section).
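Before moving on, it can be handy to confirm that the tools are actually on the PATH. Below is a minimal sketch; the function name missing_tools is mine, and the command names in the usage comment are my assumption about which binaries the packages above provide.

```shell
#!/bin/sh
# Print the name of every command from the argument list that is not on PATH.
# Note: this checks command names, not Debian package names.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Usage (command names are assumptions, adjust to what you installed):
#   missing_tools curl smbclient dig netstat htop vim
# An empty result means every listed command was found.
```

If the function prints anything, re-run the apt-get install step for the package that ships that command.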
Mounting file systems for future backups and files #
There is a variety of ways to do this, but I want to walk through a few of the most popular options (I think they are pretty good from the performance standpoint, and my benchmark tests for different files can prove this). Let’s dive into those options.
Write operations #
Every test was repeated three times. The results were rounded up or down.
Files | NFS run 1 | NFS run 2 | NFS run 3 | SMB run 1 | SMB run 2 | SMB run 3 | NFS avg. | SMB avg. |
---|---|---|---|---|---|---|---|---|
10 KiB (6998 files) | 38s | 37s | 37s | 95s | 106s | 102s | 37s | 101s |
1 MiB (240 files) | 24s | 23s | 23s | 26s | 29s | 27s | 23s | 27s |
500 MiB (1 file) | 46s | 45s | 45s | 45s | 45s | 45s | 45s | 45s |
3.5 GiB (1 file) | 323s | 323s | 324s | 325s | 324s | 323s | 323s | 324s |
Write example: (time cp -f ~/tmp/test/10KB/* /mnt/smb/test/10KB/) && (rm -f /mnt/smb/test/10KB/*)
After each read test the local cache must be cleared. Otherwise the measurement will be wrong!
Read operations #
Files | NFS run 1 | NFS run 2 | NFS run 3 | SMB run 1 | SMB run 2 | SMB run 3 | NFS avg. | SMB avg. |
---|---|---|---|---|---|---|---|---|
10 KiB (6998 files) | 25s | 26s | 26s | 60s | 57s | 57s | 26s | 58s |
1 MiB (240 files) | 24s | 24s | 25s | 28s | 29s | 27s | 24s | 28s |
500 MiB (1 file) | 45s | 45s | 45s | 48s | 50s | 48s | 45s | 48s |
3.5 GiB (1 file) | 323s | 323s | 345s | 345s | 349s | 346s | 330s | 347s |
Read example: (time cp -f /mnt/nfs/test/1MB/* ~/tmp/test/1MB/) && (rm ~/tmp/test/1MB/* && sudo sh -c 'echo 1 >/proc/sys/vm/drop_caches' && sudo sh -c 'echo 2 >/proc/sys/vm/drop_caches' && sudo sh -c 'echo 3 >/proc/sys/vm/drop_caches')
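If you want to repeat this kind of measurement on your own shares, the three-run procedure can be wrapped in two small helpers. This is only a sketch under my own assumptions: the function names are made up, timing uses whole seconds from date +%s (less precise than the time built-in used above), and the cache dropping and cleanup between runs are left to you.

```shell
#!/bin/sh
# Copy every file from directory $1 into directory $2 and print the
# elapsed wall-clock time in whole seconds.
time_copy() {
    start=$(date +%s)
    cp -f "$1"/* "$2"/
    end=$(date +%s)
    echo $((end - start))
}

# Print the mean of the given run times, rounded to the nearest second.
avg_seconds() {
    [ "$#" -gt 0 ] || return 1
    total=0
    for t in "$@"; do
        total=$((total + t))
    done
    # Adding half the divisor before dividing rounds to the nearest integer.
    echo $(( (total + $# / 2) / $# ))
}

# Example with the SMB 10 KiB write runs from the table:
#   avg_seconds 95 106 102    # prints 101
```

Run time_copy three times per data set, clear the cache between read runs, and feed the three numbers to avg_seconds.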
Network File System (NFS) #
Since my NAS (like thousands of other NAS boxes) supports NFS, I am going to enable it there and mount the share on my Debian box. I won’t describe how to enable the NFS service on the NAS, since it is super simple and can be done in the device’s administrator interface (check its network configuration / network management section).
Assuming NFS is enabled, let’s go ahead and mount the NFS share as a file system on our Linux machine.
Let’s go ahead and list exported shares from the remote device:
showmount -e IP_OF_YOUR_DEVICE_WITH_NFS
Export list for IP_OF_YOUR_DEVICE_WITH_NFS:
/volume1/share *
The above means you have share available under the path: IP_OF_YOUR_DEVICE_WITH_NFS/volume1/share
Let’s go ahead and mount this share to our Debian:
sudo mkdir /mnt/nas
sudo mount IP_OF_YOUR_DEVICE_WITH_NFS:/volume1/share /mnt/nas/
This simply mounts your share, but if you reboot your system you have to do it again, since it is not mounted automatically. In order to have your NFS share mounted every time you reboot, we need to use fstab (static information about the filesystems). Go ahead and open /etc/fstab (sudo vim /etc/fstab). I love vim, even if I had a hard time quitting it, hahaha! You can use nano or whatever text editor you like. Now drop this at the bottom of the file and adjust it to your setup:
IP_OF_YOUR_DEVICE_WITH_NFS:/volume1/share /mnt/nas nfs rw,hard,intr,nolock 0 0
If you are seeing vim for the first time: hit “Esc” a few times and then type “:wq!” (w stands for “write”, q stands for “quit”). Let’s mount it!
sudo mount -a
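If you would rather script the fstab change than edit the file by hand, here is a minimal sketch. The helper name add_fstab_entry is mine; it takes the file path as a parameter so it can be tried on a copy first, and it skips the append when the exact line is already there.

```shell
#!/bin/sh
# Append line $2 to file $1 unless that exact line is already present,
# so running the script twice does not create duplicate entries.
add_fstab_entry() {
    fstab_file=$1
    entry=$2
    if grep -qxF -- "$entry" "$fstab_file" 2>/dev/null; then
        echo "already present"
    else
        printf '%s\n' "$entry" >> "$fstab_file"
        echo "added"
    fi
}

# Usage (run as root when targeting the real file):
#   add_fstab_entry /etc/fstab \
#     "IP_OF_YOUR_DEVICE_WITH_NFS:/volume1/share /mnt/nas nfs rw,hard,intr,nolock 0 0"
```

Once the entry is in place, sudo mount -a picks it up exactly as described above.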
If everything was done right, you should be able to see your share by typing:
df
Filesystem 1K-blocks Used Available Use% Mounted on
IP_OF_YOUR_DEVICE_WITH_NFS:/volume1/share 956689920 54299008 902288512 6% /mnt/nas
Verify yours and check that the existing content of the network share is available by listing its files / folders (ls /mnt/nas). Make sure you can create files / folders in the share (mkdir foldername and touch filename).
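The same checks can be scripted, which is useful if you want to run them again after every reboot. A minimal sketch follows; both function names are mine, and is_mounted assumes a Linux-style mounts table (/proc/mounts by default) and a path written without a trailing slash.

```shell
#!/bin/sh
# Succeed if directory $1 is listed as a mount point in the mounts table
# ($2, defaulting to /proc/mounts on Linux).
is_mounted() {
    awk -v dir="$1" '$2 == dir { found = 1 } END { exit !found }' \
        "${2:-/proc/mounts}"
}

# Succeed if a file and a folder can be created and removed inside $1.
check_writable() {
    probe="$1/.write_probe_$$"
    touch "$probe.file" && mkdir "$probe.dir" &&
        rm -f "$probe.file" && rmdir "$probe.dir"
}

# Usage:
#   is_mounted /mnt/nas && check_writable /mnt/nas && echo "share is ready"
```

Depending on how the NAS exports the share, check_writable may need to run as root or as the user that will run Minio later.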
If something does not work, unmount the share (sudo umount /mnt/nas) and change things accordingly. If you have a problem mounting your shares, don’t hesitate to share your experience with me in the comment box below. I will be happy to help!
SMB shares #
This is another option we can consider for backup targets. Since SMB has progressed pretty well and implementations now support SMB2/3 (3.x), it is not too bad an option for a backup file destination either (similar to NFS). I did some research earlier, and my benchmark results are in the tables above. I am going to skip the details of this part since it is the same as for NFS. In this article you can see some CLI details on how to mount your SMB share (that article is about re-exporting a share).
SSL cert and default configuration #
Download the Minio binary for Debian (binaries for other operating systems are available too, check Minio’s GitHub).
wget https://dl.minio.io/server/minio/release/linux-amd64/minio
sudo chmod +x minio
sudo mv minio /usr/local/bin
Since we want to follow best practice, we don’t want to run our Minio service as root; instead we are going to create a separate user.
sudo useradd minio
sudo chown minio:minio /usr/local/bin/minio
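If you script this step, it helps to make it safe to re-run, because useradd fails when the user already exists. A minimal sketch: the function names are mine, and the extra useradd flags (a system account with no home directory and no login shell) are my own choice rather than something this setup requires.

```shell
#!/bin/sh
# Succeed if the named account exists on this system.
user_exists() {
    id "$1" >/dev/null 2>&1
}

# Create the account only when it is missing.
ensure_user() {
    if user_exists "$1"; then
        echo "user $1 already exists"
    else
        # Assumption: a locked-down system account is enough for a service.
        sudo useradd --system --no-create-home --shell /usr/sbin/nologin "$1"
    fi
}

# Usage:
#   ensure_user minio
```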
Let’s create default settings.
sudo mkdir /etc/minio
sudo chown minio:minio /etc/minio
Let’s create the environment variables in the /etc/default/minio file:
MINIO_VOLUMES="/mnt/nas"
MINIO_OPTS="-C /etc/minio --address :443"
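To see what these two variables turn into, the file can be written by a script and then expanded into the resulting command line. This is a sketch only: the function names are mine, and the command template (minio server $MINIO_OPTS $MINIO_VOLUMES) is my assumption about how the service unit consumes the file, so check your own minio.service.

```shell
#!/bin/sh
# Write the two settings from this article to the environment file $1.
write_minio_env() {
    cat > "$1" <<'EOF'
MINIO_VOLUMES="/mnt/nas"
MINIO_OPTS="-C /etc/minio --address :443"
EOF
}

# Source the environment file $1 (give a path containing a slash) in a
# subshell and print the command line it would produce.
# Assumed template: minio server $MINIO_OPTS $MINIO_VOLUMES
preview_minio_cmd() {
    (
        . "$1"
        echo "/usr/local/bin/minio server $MINIO_OPTS $MINIO_VOLUMES"
    )
}

# Usage (writing to /etc/default needs root):
#   write_minio_env /etc/default/minio
#   preview_minio_cmd /etc/default/minio
```

With the values above the preview prints /usr/local/bin/minio server -C /etc/minio --address :443 /mnt/nas, which matches the dry-run command used later in this section.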
For simplicity’s sake I keep my server on a bare IP and want to generate a self-signed certificate for it. To do so, let’s jump into our Debian CLI and create a new file (touch openssl.conf) with the following content:
[req]
distinguished_name = req_distinguished_name
x509_extensions = v3_req
prompt = no

[req_distinguished_name]
C = US
ST = VA
L = Somewhere
O = MyOrg
OU = MyOU
CN = MyServerName

[v3_req]
subjectAltName = @alt_names

[alt_names]
IP.1 = 127.0.0.1
Change C, ST, L, O, OU and CN to your own values (they map to the list below, respectively):
- countryName
- stateOrProvinceName
- localityName
- organizationName
- organizationalUnitName
- commonName
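One thing worth keeping in mind: a client that validates the certificate will only accept it for the addresses listed under alt_names, so a server reached by its LAN or public IP needs that IP there as well. Below is a minimal sketch that prints the same config with one extra address; the function name and the example IP 192.168.1.10 are hypothetical, and the distinguished name values are just the placeholders from above.

```shell
#!/bin/sh
# Print an openssl config like the one above, adding the server IP ($1)
# as a second subjectAltName entry next to 127.0.0.1.
make_openssl_conf() {
    server_ip=$1
    cat <<EOF
[req]
distinguished_name = req_distinguished_name
x509_extensions = v3_req
prompt = no

[req_distinguished_name]
C = US
ST = VA
L = Somewhere
O = MyOrg
OU = MyOU
CN = MyServerName

[v3_req]
subjectAltName = @alt_names

[alt_names]
IP.1 = 127.0.0.1
IP.2 = $server_ip
EOF
}

# Usage (192.168.1.10 is a made-up address, use your server's IP):
#   make_openssl_conf 192.168.1.10 > openssl.conf
```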
Save your file and let’s use the openssl tool to generate our certificate:
openssl req -x509 -nodes -days 730 -newkey rsa:2048 -keyout private.key -out public.crt -config openssl.conf
Now make sure to move both files to Minio’s certs folder (/etc/minio/certs), creating it first if it does not exist yet:
sudo mkdir -p /etc/minio/certs
sudo mv private.key public.crt /etc/minio/certs
Let’s do a dry run by issuing the following (change directory to /usr/local/bin/ first, since that is where the Minio binary was moved earlier):
./minio server -C /etc/minio/ --address ":443" /mnt/nas
If you see the following output in your CLI, you are all set:
Drive Capacity: 1.3 TiB Free, 1.3 TiB Total
Endpoint: https://YOUR_IP:443 https://127.0.0.1:443
AccessKey: 77ADLR1A59PH9GZ1U1KW
SecretKey: s6McIctVHdJQMJB7IDZ56An7nhLF+xybd6JA9+vd
Browser Access:
   https://YOUR_IP:443 https://127.0.0.1:443
Command-line Access: https://docs.minio.io/docs/minio-client-quickstart-guide
   $ mc config host add myminio https://YOUR_IP:9000 77ADLR1A59PH9GZ1U1KW s6McIctVHdJQMJB7IDZ56An7nhLF+xybd6JA9+vd
Object API (Amazon S3 compatible):
   Go: https://docs.minio.io/docs/golang-client-quickstart-guide
   Java: https://docs.minio.io/docs/java-client-quickstart-guide
   Python: https://docs.minio.io/docs/python-client-quickstart-guide
   JavaScript: https://docs.minio.io/docs/javascript-client-quickstart-guide
   .NET: https://docs.minio.io/docs/dotnet-client-quickstart-guide
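We will need the access and secret keys again when configuring the CloudBerry tools, so it can be convenient to pull them out of the startup output instead of copying them by hand. A minimal sketch, assuming the output has the "Label: value" shape shown above and was saved to a file (minio-start.log is a hypothetical name); the function name is mine.

```shell
#!/bin/sh
# Read Minio startup output on stdin and print the value that follows the
# given label (for example AccessKey or SecretKey).
extract_field() {
    awk -v label="$1:" '$1 == label { print $2; exit }'
}

# Usage:
#   extract_field AccessKey < minio-start.log
#   extract_field SecretKey < minio-start.log
```

Treat the saved log like a password file, since it contains the secret key in plain text.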
You can check Minio in a web browser as well, since it is awesome! Ctrl+C should stop the service and exit.
Since we have port 443 bound to the Minio server, we can go ahead and set up port forwarding on our Internet router. This will let us access our Minio server from the outside (a public network such as the closest Starbucks, or your home network). We won’t dwell on the details of this configuration, since it is super simple and you may have a different routing device than mine. Just check the “Port forwarding” section (or one with a similar name) in your device’s admin portal.
Autostart minio on reboot #
This is a great set of init/systemd scripts for running Minio on Linux systems (NB: this is a forked version of the minio-service repository and contains some custom changes). Let’s grab the service file for Debian then. Note that we need the raw file, not the GitHub page that renders it:
wget https://raw.githubusercontent.com/minio/minio-service/master/linux-systemd/minio.service
Go ahead and make changes in the minio.service file (you may want to change the user and group, as well as the working directory and the environment file). Once done, save the file, move it to systemd and enable the service.
sudo mv minio.service /etc/systemd/system
sudo systemctl daemon-reload
sudo systemctl enable minio
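The edits to minio.service mentioned above (user, group, environment file) can also be scripted before the file is moved into place. This is a sketch only: it assumes the unit already contains User=, Group= and EnvironmentFile= lines (check your copy, I am not quoting the upstream file here), and the function name is mine.

```shell
#!/bin/sh
# Rewrite the User=, Group= and EnvironmentFile= lines of unit file $1.
# Lines that are not present in the file are not added.
set_unit_identity() {
    unit=$1
    user=$2
    group=$3
    envfile=$4
    tmp=$(mktemp) || return 1
    if sed -e "s|^User=.*|User=$user|" \
           -e "s|^Group=.*|Group=$group|" \
           -e "s|^EnvironmentFile=.*|EnvironmentFile=$envfile|" \
           "$unit" > "$tmp"; then
        cat "$tmp" > "$unit"
    fi
    rm -f "$tmp"
}

# Usage (before moving the file to /etc/systemd/system):
#   set_unit_identity minio.service minio minio /etc/default/minio
```

If you change the unit after it has been installed, run sudo systemctl daemon-reload again so systemd picks up the new version.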
The service can be managed by the following:
sudo systemctl start minio
sudo systemctl status minio
Full list of actions: start, stop, restart, status.
If you see a green indicator, you are all set; otherwise troubleshoot according to what the status output tells you.
Check that the service is running with ps aux | grep minio (you should see the service name and its PID, as well as some other details describing the running service). You can run kill -9 PID to kill the service, and you will see it start again with another PID (because we enabled minio.service in systemd), or you can stop the service properly with systemctl (sudo systemctl stop minio). Awesome!
You are ready for the next phase.
Backup and file management tools for S3. #
I built this for my backup infrastructure: I want to store my customers’ backup data (as a service provider) in my own storage facilities (alternatively you can use S3 or Backblaze B2, but that is a completely different story). Moreover, with a recent minor update of the MBS platform I can ignore self-signed certs and skip validation of the endpoint and the credentials pair (for Minio). This makes it possible to pass the Minio access details down to the agent without backend validation. As a result, this option allows me to define private IPs as endpoints for my backups (obviously this gives better performance, but it requires a local network or a VPN between the computer with the CB Backup agent and my Minio).
Since we have the access and secret keys from the step above, as well as the private and public IPs of our Minio server, let’s walk through the configuration of each CloudBerry tool.
CloudBerry Backup #
Since CloudBerry offers two product lines (standalone and MBS), we will quickly walk through each version below. Let’s start with the MBS version.
In the MBS version of the product, storage is configured in its web backend and linked to the user we use for the CB agent. The configured credentials are passed down to the agent from the backend. So the main difference between the two versions (in terms of storage configuration) is the place where the admin does this.
Let’s make sure we have the IPs (both public and private) and the access and secret keys for our Minio. Make sure it is available and reachable on both networks (so we can configure public and private links). I assume you have an MBS profile (if you don’t, you can sign up for free). Select Storage → Storage accounts and add a new storage account. Make sure you keep “Ignore certificate” ticked, since we are using a self-signed one (if you have a fully valid certificate issued by a CA, for instance via Certbot, keep this option in its default unticked state). We want “Don’t check credentials” for a private Minio (in case we want to pass our set of credentials down to the user agent).
And now the private configuration example (“Do not check credentials (no public access)” has been ticked).
When you are done with the storage configuration, we need to associate the storage account with our user.
This user is all set and ready to send backups to the Minio storage. Repeat the same steps with the public IP (if you did it for the private one) or vice versa, in case you want both options (private and public) and have configured port forwarding on your routing device.
Standalone backup #
Let’s see how it differs in the standalone product line. Storage configuration is done on the agent (that’s the main difference, at least in our case). Let’s get the product from here (link to product) and configure it using our credentials.
CloudBerry Drive #
I’ve written about this earlier, when blogging about a call center on the open source Asterisk IP PBX with a requirement to keep call records in S3, changing the storage class from S3 Standard to S3-IA and later to S3 Glacier for lower bills and a better call review experience (since we can replay the calls directly from the S3 bucket mounted on our Windows machine).
This time we want to map our Minio server (as a network share or removable device) into a Windows environment (Drive is available for Windows only; check the system requirements here). This will help us add custom (non-backup) data to our Minio server. We can keep it in a single bucket or create separate ones (by the way, any file is kept in a bucket).
CloudBerry Explorer #
I always say that Explorer is a really essential tool when you start working with object storage. We are lucky in the case of Minio, since it comes with a web UI, but I still want to show how to use Minio in Explorer.
Resources #
- https://www.digitalocean.com/community/tutorials/how-to-set-up-an-object-storage-server-using-minio-on-ubuntu-16-04
- https://blog.varonis.com/the-difference-between-cifs-and-smb/
- http://blog.fosketts.net/2012/02/16/cifs-smb/
- https://ferhatakgun.com/network-share-performance-differences-between-nfs-smb/