Data to S3 with command line and encryption

This article details how to archive folders with their contents, encrypt the archives, upload them to S3, and remove the local copies. A typical use case is archiving a snapshot of a website containing twenty thousand files on a certain date before reorganizing the folder layout for ten thousand of its images.

With about 345 GB of data in S3, I run into specific situations that merit custom scripts. One of these is that I like to create a snapshot of a folder at a specific point in time before making adjustments to the scripts and software within it. I also do this for documents and other folders so that a point-in-time backup exists. That way, if I later decide I want a previous folder structure and its contents back, I can simply unzip the archive I created. The purpose of these backups is to keep a point-in-time previous version of a website or web application in case I change my mind later on; they are rarely used. The requirements are to copy a folder and all of its contents recursively, preserving permissions such as the execute permission on scripts; to encrypt the resulting zip file using AES-256 and upload it to an S3 bucket; and then to delete the local zip file and encrypted file to free the disk space. The second component is a decryption script for when I download such a file and want to extract it.

There are some factors to consider, and one is storage cost. Encrypting the archives adds about 30% to the file size, mostly because the -a flag Base64-encodes the output. S3 offers an automatic transition to a lower-cost storage tier, so I enable it so the backups move to cheaper longer-term storage. The Standard-Infrequent Access (Standard-IA) tier is just over a penny per GB per month.

The second major consideration is security. Amazon offers encryption of S3 buckets and their contents using keys managed by Amazon, and I turn this on. My concern is hackers grabbing the data using stolen keys, not hiding it from the NSA or other government agencies; if anyone will have quantum computers, it will be the NSA, so if they want the data, they will get it. With this in mind, I chose symmetric encryption and OpenSSL for the encryption. GnuPG is a very popular and widely recommended application for this, and there are arguments against using OpenSSL for this purpose. Ultimately it came down to the fact that the objective is AES encryption with a password that can be passed in scripts, rather than public and private keys. There is a risk that Amazon's encryption keys could be compromised, allowing an attacker to read the data. It is also likely that Amazon has intrusion detection systems that would catch attackers attempting to brute-force the passwords on files within an S3 bucket; attackers could instead download all the files and attempt to brute-force them locally. I plan to cover this choice in more detail in a later blog post.
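
Both the lifecycle transition and the default bucket encryption can be set in the S3 console, but they can also be applied from the AWS CLI. The sketch below assumes a bucket named bucket-name, the long-term-archives/ prefix used later in this article, and a 30-day transition:

# transition objects under long-term-archives/ to Standard-IA after 30 days
aws s3api put-bucket-lifecycle-configuration --bucket bucket-name \
  --lifecycle-configuration '{
    "Rules": [{
      "ID": "archives-to-standard-ia",
      "Filter": {"Prefix": "long-term-archives/"},
      "Status": "Enabled",
      "Transitions": [{"Days": 30, "StorageClass": "STANDARD_IA"}]
    }]
  }'

# turn on default server-side encryption with Amazon-managed keys (SSE-S3)
aws s3api put-bucket-encryption --bucket bucket-name \
  --server-side-encryption-configuration \
  '{"Rules": [{"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}}]}'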

The choices are the Zip file format, so that the archives can be viewed with ease on multiple platforms; *nix file permission preservation on Linux; AES encryption prior to upload; and decryption when the file is downloaded. To accomplish this, I created a folder in the home directory called Archives. This Archives folder is where the zip files are created. The next step is the creation of two scripts, which I called myencrypt and mydecrypt and store in the home folder in a subdirectory called Scripts. The final piece is two functions within the .bashrc file so that I can run the whole process at the command line.
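
The one-time setup is minimal; a sketch, assuming the folder and script names above:

# folders for the zip files and the helper scripts
mkdir -p ~/Archives ~/Scripts

# after creating ~/Scripts/myencrypt and ~/Scripts/mydecrypt (shown below), make them executable
chmod +x ~/Scripts/myencrypt ~/Scripts/mydecrypt

# reload .bashrc once the ltsarchive and ltsdecrypt functions have been added
source ~/.bashrc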

myencrypt

#!/usr/bin/env bash

# $1 = input file; output is written to $1.encrypted
openssl enc -aes-256-cbc -salt -a -p -in "$1" -out "$1.encrypted" -k "some-cool-password"

mydecrypt

#!/usr/bin/env bash

# $1 = input file
# $2 = output file
openssl enc -aes-256-cbc -a -d -p -in "$1" -out "$2" -k "some-cool-password"
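
As an aside, OpenSSL 1.1.1 and later warn that the default key derivation used by enc -k is weak. If both scripts are updated together, PBKDF2 key stretching can be added; a sketch with the same placeholder password:

# myencrypt with PBKDF2 (output still written to $1.encrypted)
openssl enc -aes-256-cbc -salt -a -pbkdf2 -iter 100000 -in "$1" -out "$1.encrypted" -k "some-cool-password"

# mydecrypt must use the same flags
openssl enc -aes-256-cbc -a -d -pbkdf2 -iter 100000 -in "$1" -out "$2" -k "some-cool-password"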

bashrc, ltsarchive()

ltsarchive() { timestamp=$(date +"%Y-%m-%d-%H%M%p") && read -p "Enter folder name: " name && zip -rv9 ~/Archives/"$name-$timestamp".zip "$name" && ~/Scripts/myencrypt ~/Archives/"$name-$timestamp".zip && aws s3 cp ~/Archives/"$name-$timestamp".zip.encrypted s3://bucket-name/long-term-archives/"$name-$timestamp".zip.encrypted && rm -f ~/Archives/"$name-$timestamp".zip && rm -f ~/Archives/"$name-$timestamp".zip.encrypted; }
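
The one-liner is easier to follow spread over several lines; this version is functionally identical, with bucket-name still a placeholder:

ltsarchive() {
  timestamp=$(date +"%Y-%m-%d-%H%M%p") &&
  read -p "Enter folder name: " name &&
  zip -rv9 ~/Archives/"$name-$timestamp".zip "$name" &&
  ~/Scripts/myencrypt ~/Archives/"$name-$timestamp".zip &&
  aws s3 cp ~/Archives/"$name-$timestamp".zip.encrypted \
    s3://bucket-name/long-term-archives/"$name-$timestamp".zip.encrypted &&
  rm -f ~/Archives/"$name-$timestamp".zip &&
  rm -f ~/Archives/"$name-$timestamp".zip.encrypted
}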

bashrc, ltsdecrypt()

ltsdecrypt() { filepath=$(pwd) && read -p "Enter file name: " name && newname=${name%.encrypted} && ~/Scripts/mydecrypt "$filepath/$name" "$filepath/$newname"; }

For example, suppose I am in a directory and want to archive the Templates directory and all of its contents. I type ltsarchive and the shell prompts me for the name of the folder. I type Templates and the function runs: it first zips the Templates folder and its contents into ~/Archives/Templates-2018-07-23-2037PM.zip. That file is then passed to OpenSSL, which encrypts it to Templates-2018-07-23-2037PM.zip.encrypted. Next, that file is uploaded to the long-term-archives folder in the specified S3 bucket. After that, the zip file and the .encrypted file are deleted from ~/Archives/. Upon checking the S3 bucket, Templates-2018-07-23-2037PM.zip.encrypted appears.

Later, the files can be downloaded via the web browser or through the command line interface. To decrypt Templates-2018-07-23-2037PM.zip.encrypted, we change into the directory containing it at the shell and type ltsdecrypt. It asks for the file name, so we type Templates-2018-07-23-2037PM.zip.encrypted. It decrypts the file and leaves Templates-2018-07-23-2037PM.zip in the same folder as the download. These scripts can then be used in automation.
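
A restore from the command line looks roughly like this (a sketch, assuming the same bucket name and the example archive above; any working directory will do):

aws s3 cp s3://bucket-name/long-term-archives/Templates-2018-07-23-2037PM.zip.encrypted .
ltsdecrypt
# prompts: Enter file name: Templates-2018-07-23-2037PM.zip.encrypted
unzip Templates-2018-07-23-2037PM.zip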
