How to save Keras models directly to AWS S3

How to make sharing model files among data science teams simpler

Photo by Chris Ried on Unsplash

Keras is a very popular framework developed by Google for training and using machine learning models, and it has become nearly ubiquitous in the field. My work involves building tools that make machine learning and its related applications easy for data scientists to use, and one requirement was to make sharing model files easier among a data science team.

Serialized machine learning models are mostly binary files, which makes them poorly suited to storage and versioning in conventional version control systems such as git. The solution is to put them in an object store such as AWS S3, where they can be stored, updated and used by different data scientists on the same team. However, Keras by default (in TensorFlow 2.x releases before Keras 3) stores its models in a folder structure, the SavedModel format. Take a simple model:
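The model itself is not shown here, so the following is a minimal stand-in, assuming TensorFlow 2.x is installed. The input shape and layer sizes are arbitrary choices made for illustration only.

```python
from tensorflow import keras

# Hypothetical toy model: 4 input features, one hidden layer, 1 output.
model = keras.Sequential(
    [
        keras.layers.Input(shape=(4,)),
        keras.layers.Dense(8, activation="relu"),
        keras.layers.Dense(1),
    ]
)
model.compile(optimizer="adam", loss="mse")
```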

Saving it with model.save("my_model") generates the following folder structure:
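As a rough illustration, assuming the placeholder folder name my_model, a TensorFlow 2.x SavedModel export typically looks like the listing below. The exact contents vary with the TensorFlow version and the model; keras_metadata.pb, for instance, is only written by newer 2.x releases.

```text
my_model/
├── assets/
├── variables/
│   ├── variables.data-00000-of-00001
│   └── variables.index
├── keras_metadata.pb
└── saved_model.pb
```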

This is only a very simple example of the various folders that Keras could generate, depending on the type of model you are creating. By passing the top-level folder name to the loader, in the fashion of keras.models.load_model("my_model") (where my_model is a placeholder folder name), you can load the model for later use.

Problem: Enable easy export of Keras models to S3 without needing to traverse through the generated folder structure in code, and enable easy fetching of a model exported in such a manner so that it can be immediately loaded by Keras.

Solution: Zip up the folder structure generated by Keras in a temporary folder. Upload the zipped file to S3. When loading a model, download the corresponding zip file from S3 into a temporary folder, unzip it, and load it from there.

Gist for the complete code.

Let’s say we have a simple Keras model like the one outlined above.

We’re going to use Python’s tempfile library to save this model in a temporary location:
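A minimal sketch of this step follows. The helper name save_in_tempdir and its return value are hypothetical, chosen so the behavior is easy to inspect. It also assumes a TensorFlow 2.x release in which model.save with an extension-less path writes a SavedModel folder; under Keras 3 the same call expects a .keras or .h5 file name instead.

```python
import os
import tempfile


def save_in_tempdir(model, model_name):
    """Save `model` inside a temporary directory and report what was written.

    `model` is assumed to be a Keras model, but any object with a
    `save(path)` method that writes a folder works for this sketch.
    """
    with tempfile.TemporaryDirectory() as tempdir:
        model_dir = os.path.join(tempdir, model_name)
        model.save(model_dir)
        # Collect the top-level entries while the directory still exists.
        written = sorted(os.listdir(model_dir))
    # Once the with block has exited, tempdir has been removed from disk.
    return tempdir, written
```

In the real workflow the zipping and uploading happen inside the with block, since the saved folder is gone once the block exits.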

By using the temporary directory as a context manager, with tempfile.TemporaryDirectory(), we ensure that the temporary directory is deleted as soon as we leave that context block.

Next, we zip it up. This uses a zipdir function, which traverses the folder with the Keras model in it and adds its contents to the given zip file:
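One possible version of zipdir is sketched below using only the standard library, together with a hypothetical wrapper, zip_model_dir, that opens the archive and calls it. Storing paths relative to the model folder, and keeping empty directories such as assets/, are choices made for this sketch rather than details confirmed by the text.

```python
import os
import zipfile


def zipdir(path, ziph):
    """Add every file and sub-folder under `path` to the open ZipFile `ziph`."""
    for root, dirs, files in os.walk(path):
        for name in dirs + files:
            full_path = os.path.join(root, name)
            # Store entries relative to `path` so the archive unpacks
            # directly into the model's folder layout.
            ziph.write(full_path, os.path.relpath(full_path, path))


def zip_model_dir(model_dir, zip_path):
    """Zip the saved model folder at `model_dir` into `zip_path`."""
    # ZIP_STORED skips compression, which keeps this step fast.
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_STORED) as zipf:
        zipdir(model_dir, zipf)
    return zip_path
```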

Now, we can use an s3fs object to write the zipped file to the S3 bucket we need:
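The upload can be sketched as a small helper. The function name, its parameters and the bucket/model_name.zip key layout are assumptions made for illustration; fs is expected to be an s3fs.S3FileSystem instance, whose put method copies a local file to a remote path.

```python
def upload_model_zip(fs, zip_path, bucket_name, model_name):
    """Copy the local zip at `zip_path` to <bucket_name>/<model_name>.zip.

    `fs` is assumed to be an s3fs.S3FileSystem (or any object exposing a
    compatible `put(local_path, remote_path)` method).
    """
    remote_path = f"{bucket_name}/{model_name}.zip"
    fs.put(zip_path, remote_path)
    return remote_path
```

With real credentials available in the environment, fs would be created with s3fs.S3FileSystem() and passed in. Taking it as a parameter also makes the helper easy to exercise without touching S3.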

To get this file back and use it in Keras, we have a simple function that uses all the above libraries to reverse the process:
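A sketch of the reverse direction, under the same assumptions as the upload: fs is an s3fs.S3FileSystem, and the zip lives at bucket/model_name.zip. The loader parameter is a hypothetical addition for this sketch; in practice it would be keras.models.load_model.

```python
import os
import tempfile
import zipfile


def s3_get_keras_model(fs, bucket_name, model_name, loader):
    """Download <bucket_name>/<model_name>.zip, unzip it, and load the model.

    `loader` is a callable that takes the extracted folder path and returns
    the loaded model, e.g. keras.models.load_model.
    """
    with tempfile.TemporaryDirectory() as tempdir:
        zip_path = os.path.join(tempdir, f"{model_name}.zip")
        model_dir = os.path.join(tempdir, model_name)
        # Fetch the zip file into the temporary directory.
        fs.get(f"{bucket_name}/{model_name}.zip", zip_path)
        # Extract it next to the zip, recreating the saved folder layout.
        with zipfile.ZipFile(zip_path) as zipf:
            zipf.extractall(model_dir)
        # Load while the folder still exists; it is deleted on exit.
        return loader(model_dir)
```

Loading inside the with block matters here: the extracted folder is removed as soon as the block exits, so the model has to be in memory by then.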

Put everything together, and we have a simple implementation of saving Keras models in their entirety to S3 and getting them back without having to think about traversing nested folder structures created when saving Keras models.
