Using boto3 (Python) with R (reticulate) to push to S3
Using Python with R
The motivation for this comes from the fact that it is now easier than ever to use Python with R via the reticulate package.
And it makes a lot of sense, too! I regularly upload client data to S3 buckets from R scripts, with RStudio Connect handling the automation. Unfortunately, the aws.s3 R package has been wonky and no longer seems to be maintained.
Enter boto3, a Python package that lets you interface with AWS S3. It is cleaner and easier to handle in terms of setup.
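If boto3 is not yet available in the Python environment reticulate points at, you can install it straight from R. A minimal sketch (which environment it lands in depends on your reticulate configuration):

library(reticulate)

# Install boto3 into the Python environment reticulate is using
py_install("boto3")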
Example
You first set up your R script as you usually would (assuming you have Python installed and configured), with the addition of loading the reticulate library.
library(tidyverse)
library(reticulate)
## Do stuff and write to file
path <- "./file_name.csv"             # local file written above
s3_path <- "directory/file_name.csv"  # destination key inside the bucket
ACCESS_KEY <- "AWS-KEY"
SECRET_KEY <- "AWS-SECRET"
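As an aside, rather than hardcoding the key pair you may prefer to read it from environment variables. A quick sketch, assuming you have set AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY (for example in your .Renviron):

# Safer: pull credentials from the environment instead of the script
ACCESS_KEY <- Sys.getenv("AWS_ACCESS_KEY_ID")
SECRET_KEY <- Sys.getenv("AWS_SECRET_ACCESS_KEY")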
Once you’re done processing data and writing the results to a file, you move on to the Python section. You may well have seen the function below in some shape or form before.
import boto3
from botocore.exceptions import NoCredentialsError


def upload_to_aws(local_file, bucket, s3_file):
    # Build an S3 client with the credentials defined in the R session
    s3 = boto3.client('s3', aws_access_key_id = r.ACCESS_KEY,
                      aws_secret_access_key = r.SECRET_KEY)

    try:
        s3.upload_file(local_file, bucket, s3_file,
                       ExtraArgs = {'ACL': 'bucket-owner-full-control'})
        print("Upload Successful")
        return True
    except FileNotFoundError:
        print("The file was not found")
        return False
    except NoCredentialsError:
        print("Credentials not available")
        return False


uploaded = upload_to_aws(r.path, 'bucket_name', r.s3_path)
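In an R Markdown document the code above runs as a regular python chunk. From a plain R script, one option is reticulate's source_python(), sketched below under the assumption that the Python code lives in a file called upload.py (a hypothetical name) next to your script:

# Runs upload.py; upload_to_aws() also becomes callable from R afterwards
source_python("upload.py")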
The beauty of this is that you can reference R objects easily by prepending r., which lets you feed not only your key/secret pair into the Python script but also the file paths for both the local file and the S3 destination.
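It works the other way around as well: when the Python code runs in an R Markdown chunk (or via py_run_string()), its top-level objects are exposed back to R through the py object, so you can branch on the result. A minimal sketch:

# Python objects come back to R via py$
if (py$uploaded) {
  message("File pushed to S3")
} else {
  warning("Upload failed, see the message above")
}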
That’s it!!
Thumbnail copyright: RStudio