Migrating from Paperclip to Active Storage in Rails 6.1.3

I’ve spent the past month or so moving my company’s application from Paperclip to Active Storage for our attachments. I found a lot of articles that gave a very rosy view of how to do this, but none that quite hit the spot for me. We have millions of attachments in our production S3 bucket, so things need to be done right.

These are the main problems I encountered and how I solved them.

  • Migration Strategy
  • Files saved in S3 root by default
  • Variants
  • Purging
  • Testing against LocalDisk setup

For some background: we have multiple models with single attachments, both images and documents. Some of these are validated, some aren’t. In Paperclip, when an image is uploaded, it is resized to multiple standardised sizes and stored in an S3 folder. The app has functionality to add, change, upload and download files.

A quick summary of why I think Active Storage is better:

  • Paperclip is deprecated
  • Active Storage is built into Rails, so it feels like the natural choice
  • It uses new tables created through migrations rather than having to change model tables that already exist
  • Active Storage can create resized images on the fly, and as of 6.1 it also records them in a table, so performance on fetching them from S3 is greatly improved (more later)

There have admittedly been downsides, the greatest ones for me being:

  • Active Storage is very opinionated
  • You still have to work to get the files to save where you want in S3
  • Active Storage doesn’t slot as easily into your code as Paperclip did (this is probably my fault for wanting the custom keys)
  • Files don’t always want to purge

Finding a Migration Strategy and Script

The article that got me closest to the finished result was this one. It was almost spot on and I’m very grateful it exists. It gave me a way to copy all my existing attachments to my new Active Storage path in S3, plus a concern to add to my models so that new attachments created during the migration period would also be saved to S3 in the correct place. A few improvements I made were to:

  • batch the rake task
  • move the delete check higher to save work
  • add a check for whether the record already has an Active Storage attachment, and if so purge it first to avoid orphaned files in S3
  • change the blob created to use a custom key field, so the files aren’t created in the root of the S3 folder
  • create a CSV file for each model, saving the record id, record type, Paperclip S3 path and Active Storage S3 path to troubleshoot any issues
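
The batching and CSV logging ideas can be sketched in plain Ruby. This is only an illustration under assumptions, not the actual rake task: LegacyRecord, migrate_in_batches and the column names are hypothetical, and the block passed in stands in for the real copy to S3. In a Rails task, ActiveRecord's find_each would take the place of each_slice.

```ruby
require "csv"
require "stringio"

# Hypothetical stand-in for an ActiveRecord model with a Paperclip attachment.
LegacyRecord = Struct.new(:id, :paperclip_path)

# Walks the records in batches and writes one audit row per record.
# The block performs the actual copy and returns the new Active Storage key.
def migrate_in_batches(records, csv_io, batch_size: 500)
  csv = CSV.new(csv_io)
  csv << %w[record_id record_type paperclip_path active_storage_key]

  records.each_slice(batch_size) do |batch|
    batch.each do |record|
      new_key = yield(record)
      csv << [record.id, record.class.name, record.paperclip_path, new_key]
    end
  end

  csv_io
end

records = [
  LegacyRecord.new(1, "people/1/original/a.jpg"),
  LegacyRecord.new(2, "people/2/original/b.jpg")
]

# The block here just invents a key; the real task would copy the file first.
audit = migrate_in_batches(records, StringIO.new, batch_size: 1) do |record|
  "s3/path/example-key-#{record.id}"
end

puts audit.string
```

Having both the old path and the new key on one row makes it easy to spot-check individual records after the run.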

I stuck to the two-PR approach suggested in the link. Once this was done, I moved on to the second PR, which starts using the Active Storage S3 attachments I had just created.

S3 Saving in Root Directory

A big no for me was Active Storage’s love of saving all files in the root folder of your S3 bucket with only a random mix of letters and numbers for each file’s name. People have come up with workarounds, but thankfully Rails 6.1 added the ability to specify the key, which sets the save path in the S3 bucket! (Cue a montage of creating and implementing a card to upgrade our Rails version from Rails 6 to Rails 6.1.)

Unfortunately, if you’re moving over from Paperclip this won’t naturally slot in. The Paperclip approach to attaching a photo to an object is something like:

photo_file = File.open(photo_path)
person.photo = photo_file
person.save!

If I were to use this with Active Storage, my file would end up in the root of S3. That is the rosy picture sold by others as no work, but not what I was looking for. Specifying a custom key looks like this (content_type is optional):

person.photo.attach(
  key: "s3/path/#{ActiveStorage::Blob.generate_unique_secure_token}",
  io: File.open(photo_path),
  filename: "photo.jpg"
)
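
The key itself is just a prefix plus a random token, so building it can live in one small helper. The sketch below is illustrative only: build_storage_key is a hypothetical name, the default token length is an assumption, and SecureRandom.alphanumeric stands in for ActiveStorage::Blob.generate_unique_secure_token so that the snippet runs outside Rails.

```ruby
require "securerandom"

# Hypothetical helper: builds a storage key under a fixed prefix with a random,
# hard-to-guess final segment. Inside a Rails app you would call
# ActiveStorage::Blob.generate_unique_secure_token instead of SecureRandom.
def build_storage_key(prefix, token_length: 28)
  token = SecureRandom.alphanumeric(token_length).downcase
  # chomp avoids a double slash if the prefix already ends with one.
  "#{prefix.chomp('/')}/#{token}"
end

puts build_storage_key("s3/path")
```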

I used the Blob’s random name creation to stop users being able to predict URLs in my S3 bucket. As I ended up with blocks like this littered throughout the code, I moved this into a concern and applied it to all models with an attachment. Annoyingly, this meant I had to implement it as an attach method on the model instance rather than on the attachment itself. The call then becomes:

person.attach(File.open(photo_path))

My Object model would have an include ActiveStorageAttachments statement; below is a simplified example of the concern. As models have different attachment names in my app, I had to do a bit of trickery for this to work, which I did in the attachment_type method. Obviously this only works if each model has one attachment, and I’d wager there are much nicer ways with Rails magic.

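module ActiveStorageAttachments
  extend ActiveSupport::Concern

  # Attaches a file to whichever attachment this model defines.
  # attach_file and attachment_type are helpers left out of this simplified example.
  def attach(file)
    return if file.nil?

    attach_file(file, attachment_type)
  end
end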
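
For readers wondering what the attachment lookup might look like, here is one possible shape, written as plain Ruby so it runs without Rails. Everything in it is an assumption rather than the concern described above: FakeAttached stands in for ActiveStorage::Attached::One, and the attachment_name class macro, the filename: keyword and the module name are all hypothetical.

```ruby
require "securerandom"
require "stringio"

# Stand-in for ActiveStorage::Attached::One; it only records the last call.
class FakeAttached
  attr_reader :last_attach

  def attach(**options)
    @last_attach = options
  end
end

# Hypothetical sketch: each model declares the name of its single attachment,
# and the instance-level attach method forwards to it with a custom key.
module AttachmentDispatchSketch
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # Acts as both setter (with an argument) and getter (without one).
    def attachment_name(name = nil)
      @attachment_name = name if name
      @attachment_name
    end
  end

  def attach(file, filename:)
    return if file.nil?

    attachment_type.attach(
      key: "s3/path/#{SecureRandom.alphanumeric(28).downcase}",
      io: file,
      filename: filename
    )
  end

  # Resolves the model-specific attachment, e.g. person.photo.
  def attachment_type
    public_send(self.class.attachment_name)
  end
end

class Person
  include AttachmentDispatchSketch
  attachment_name :photo

  def photo
    @photo ||= FakeAttached.new
  end
end

person = Person.new
person.attach(StringIO.new("fake image bytes"), filename: "photo.jpg")
puts person.photo.last_attach[:key]
```

Like the approach it illustrates, this only works when each model has exactly one attachment.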

Active Storage Variants

Active Storage offers variants, which enable you to transform your image, for example to create a thumbnail. These are created on the fly by downloading your original image, resizing it and then uploading the created variant to S3.

person.photo.variant(resize: "300x300")

In previous versions, fetching this variant meant a call to S3 to find out whether the variant already existed, followed by either another call to fetch the image or the full process of downloading the original to create the variant.

In Rails 6.1, a new table migration was introduced for Active Storage. The table records which variants are available for each image, greatly reducing the number of calls that need to be made to S3. You can enable this tracking by adding the following line to your environment config:

config.active_storage.track_variants = true

When a new variant record is created, a row is added to each of the active_storage_variant_records, active_storage_attachments and active_storage_blobs tables. The variant record contains a digest of the variant transformation (variation_digest) and the blob_id of the image it is a variant of. The entry in the attachments table has a record_id pointing at the variant record and a blob_id pointing at the variant’s own row in the blobs table, where you can find the key of the variant.

This is such an awesome change for 6.1, as you can tell by all the excited comments on the PR. I didn’t hit many stumbling blocks implementing it, apart from the purging issue below, but there isn’t much to find on the internet about it yet.

Purging

When purging existing records, I found issues getting rid of the variants in my S3 bucket. By looking at the Active Storage logs, I was able to find a way around this.

By default, i.e. if you don’t mess with keys on your blob and don’t have track_variants set to true, Active Storage will save variants in your S3 bucket in the path:

variants/<original_image_key>/<random_file_name>

For some unknown reason, if track_variants is true, it just sticks them in the root of your S3 bucket again. So for this one I had to override the Active Storage method that writes the variants, which is found in variant_with_record.rb. In the create_or_find_record method, you’ll see a basic attach is used. To overcome this, I copied the contents to a new file, config/initializers/variant_with_record.rb, and changed the method to the following:

def create_or_find_record(image:)
  @record =
    ActiveRecord::Base.connected_to(role: ActiveRecord::Base.writing_role) do
      blob.variant_records.create_or_find_by!(variation_digest: variation.digest) do |record|
        record.image.attach(
          io: image[:io],
          key: "variants/#{blob.key}/#{ActiveStorage::Blob.generate_unique_secure_token}",
          filename: image[:filename]
        )
      end
    end
end

This now saves in the default place, meaning the variant files are picked up by the purge job:

S3 Storage (166.0ms) Deleted file from key: s3/path/qopxrg1ty8voyw4zvg9ru16dj666
S3 Storage (123.6ms) Deleted files by key prefix: variants/s3/path/qopxrg1ty8voyw4zvg9ru16dj666/
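
To see why the key layout matters, here is a small stand-alone sketch of prefix matching, which is what a delete by key prefix relies on. The helper names and sample keys are made up for illustration, and no S3 calls are involved.

```ruby
# Hypothetical helper mirroring the key layout used in the override above.
def variant_key(blob_key, token)
  "variants/#{blob_key}/#{token}"
end

# The prefix that appears in the purge log for a given original key.
def variant_prefix(blob_key)
  "variants/#{blob_key}/"
end

original_key = "s3/path/original"

stored_keys = [
  original_key,                      # the original upload
  variant_key(original_key, "aaa"),  # a variant saved under the variants/ prefix
  "zzz"                              # a variant left in the bucket root
]

# Only keys under the prefix are picked up; the root-level variant is missed.
matched = stored_keys.select { |key| key.start_with?(variant_prefix(original_key)) }
p matched
```

A variant saved in the bucket root shares no prefix with its original, so a prefix delete has no way to find it.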

Sadly I’ve yet to find a good solution for removing the variant records from the blob table, but at least they’re removed from S3 — I’ll update if I find a good way.

Testing against LocalDisk Setup

My final hurdle was getting all of the tests to work. We have a lot of tests, which is awesome, and hundreds were now failing, which is not so awesome. Thankfully most of these were just semantic changes between Paperclip and Active Storage, e.g. in Paperclip if an attribute doesn’t have an attachment it is nil, so I could do:

expect(person.photo.nil?).to be true

In Active Storage, regardless of whether there is an attachment, it is an ActiveStorage::Attached::One, and you need to use attached? to check if it has an attachment.

expect(person.photo.attached?).to be true

Simple enough.

The biggest problem was that all of my work revolved around S3, but I didn’t want my local test environment to be wasting S3 storage space with test files. The more developers you have, the quicker that becomes an expensive problem. So in my storage.yml I added a test environment config:

test:
  service: Disk
  root: <%= Rails.root.join("tmp/storage") %>

The Disk config works fine until your tests try to use code that is written with the ability to use S3. For example, to get the path of an attachment, you call object.attachment.url (or service_url, its older name). This will throw a nil error, as the test object attachment doesn’t have a url.

To fix this, I added a new method to my ActiveStorageAttachments concern that handles giving me back the right value depending on my environment. I then replaced all instances of object.attachment.url with object.get_attachment_url.

def get_attachment_url
  if Rails.env.test?
    # Tests use the Disk service, so return the local file path instead of a URL.
    ActiveStorage::Blob.service.path_for(attachment_type.key)
  else
    Rails.application.routes.url_helpers.rails_blob_url(attachment_type)
  end
end

ActiveStorage::Blob.service.path_for returns the full path for the file stored on your local disk, using the attachment key. rails_blob_url returns the Active Storage URL, which will redirect to the actual S3 endpoint when called.

That’s a quick summary of the fun I’ve had during this migration. Hopefully some of this will help others reduce their pain a little.

Developer at Nourish Care
