The ELC Community Blog
A knowledge exchange on Ruby on Rails and Agile Development
AWS-S3 gem extensions and Amazon's Copy API
by Alex Chee on June 04, 2008
One of our projects needed to copy lots of files between different S3 buckets, and Amazon had just released a beta API for copying S3 objects. We decided to use the new feature instead of downloading each file and uploading it back to S3, which was the only official way to do this before.
We also found that the gem's copy and rename methods did not take an argument for a destination bucket, so they could not work across buckets. So we made a patch to the aws-s3 gem that uses the new Copy API and accepts an extra argument for the destination bucket, which is the part we found most useful.
copy_patch.diff:
Index: lib/aws/s3/object.rb
===================================================================
--- lib/aws/s3/object.rb (revision 1282)
+++ lib/aws/s3/object.rb (working copy)
@@ -178,19 +178,19 @@
           end
         end

-        # Makes a copy of the object with <tt>key</tt> to <tt>copy_name</tt>.
-        def copy(key, copy_key, bucket = nil, options = {})
-          bucket = bucket_name(bucket)
-          original = open(url_for(key, bucket))
+        # Makes a copy of the object with <tt>key</tt> in bucket <tt>src_bucket</tt> to <tt>copy_name</tt> in bucket <tt>dest_bucket</tt>.
+        def copy(key, copy_key, src_bucket = nil, dest_bucket = nil, options = {})
+          src_bucket = bucket_name(src_bucket)
+          dest_bucket = bucket_name(dest_bucket)
+          original = open(url_for(key, src_bucket))
           default_options = {:content_type => original.content_type}
-          store(copy_key, original, bucket, default_options.merge(options))
-          acl(copy_key, bucket, acl(key, bucket))
+          copy(key, copy_key, src_bucket, dest_bucket, options)
         end

-        # Rename the object with key <tt>from</tt> to have key in <tt>to</tt>.
-        def rename(from, to, bucket = nil, options = {})
-          copy(from, to, bucket, options)
-          delete(from, bucket)
+        # Rename the object with key <tt>from</tt> in bucket <tt>src_bucket</tt> to have a key in <tt>to</tt> in bucket <tt>dest_bucket</tt>.
+        def rename(from, to, src_bucket = nil, dest_bucket = nil, options = {})
+          copy(from, to, src_bucket, dest_bucket, options)
+          delete(from, src_bucket)
         end

         # Fetch information about the object with <tt>key</tt> from <tt>bucket</tt>. Information includes content type, content length,
@@ -238,8 +238,35 @@

           put(path, options, data) # Don't call .success? on response. We want to get the etag.
         end
+
+
+        # Copies an object from <tt>source_key</tt> and <tt>source_bucket</tt> to <tt>dest_key</tt> and <tt>dest_bucket</tt>
+        def copy(source_key, dest_key, source_bucket = nil, dest_bucket = nil, options = {})
+          validate_key!(dest_key)
+          # Must build path before infering content type in case bucket is being used for options
+          path1 = path!(dest_bucket, dest_key, options)
+          path2 = path!(source_bucket, source_key, options)
+          infer_content_type!(dest_key, options)
+          options['x-amz-copy-source'] = path2
+          options['x-amz-metadata-directive'] = 'COPY'
+          put(path1, options) # Don't call .success? on response. We want to get the etag.
+        end
+
         alias_method :create, :store
         alias_method :save, :store
+
+
+        def copy(source_key, dest_key, source_bucket = nil, dest_bucket = nil, options = {})
+          validate_key!(dest_key)
+          # Must build path before infering content type in case bucket is being used for options
+          path1 = path!(dest_bucket, dest_key, options)
+          path2 = path!(source_bucket, source_key, options)
+          infer_content_type!(dest_key, options)
+          options['x-amz-copy-source'] = path2
+          options['x-amz-metadata-directive'] = 'COPY'
+          put(path1, options) # Don't call .success? on response. We want to get the etag.
+        end
+

         # All private objects are accessible via an authenticated GET request to the S3 servers. You can generate an
         # authenticated url for an object like this:
Just apply this patch in your gem directory and change all calls to copy and rename to include the destination bucket in the arguments. I suggest freezing the gem and applying the patch in the vendor/gems/aws-s3 directory, so that you do not change the system-wide gem and break your other projects.
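For reference, here is a rough sketch of what a call through the patched method amounts to on the wire. Note that copy_request is a hypothetical helper written for illustration, not part of the aws-s3 gem. It only builds the pieces of the PUT request that the Copy API expects (the destination path plus the x-amz-copy-source and x-amz-metadata-directive headers) and does not sign or send anything. The bucket and key names are made up.

```ruby
require 'erb'

# Hypothetical helper (not part of the aws-s3 gem): describes the PUT request
# that an S3 server-side copy boils down to. Nothing is signed or sent.
# The argument order mirrors the patched S3Object.copy.
def copy_request(source_key, dest_key, source_bucket, dest_bucket, headers = {})
  # Escape each path segment but keep the slashes between them.
  escape = lambda do |bucket, key|
    '/' + ([bucket] + key.split('/')).map { |part| ERB::Util.url_encode(part) }.join('/')
  end

  {
    :method  => :put,
    :path    => escape.call(dest_bucket, dest_key),
    :headers => headers.merge(
      'x-amz-copy-source'        => escape.call(source_bucket, source_key),
      'x-amz-metadata-directive' => 'COPY' # keep the source object's metadata
    )
  }
end

# Made-up bucket and key names, purely for illustration.
request = copy_request('photos/cat 1.jpg', 'archive/cat 1.jpg', 'my-uploads', 'my-backups')
puts request[:path]
puts request[:headers]['x-amz-copy-source']
```

With the patch applied, the equivalent call through the gem would be along the lines of S3Object.copy('photos/cat 1.jpg', 'archive/cat 1.jpg', 'my-uploads', 'my-backups'), and rename takes the same four arguments before deleting the source object.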
Since you're already modifying your aws-s3 gem, it might be worthwhile to also add Expires and Cache-Control headers to your static assets (images, JavaScript, and CSS). This will make the browser cache the files for 3 years (don't worry, if you change the file, S3 will still update the Cache-Control header) and make YSlow happy.
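As a minimal sketch of what those two headers could look like, the snippet below computes far-future values for a given upload time. far_future_headers is a hypothetical helper, not something the gem provides, and the "public" directive and the exact three-year figure (counted as 3 x 365 days) are assumptions made for the example.

```ruby
require 'time'

# Hypothetical helper: far-future caching headers for a static asset.
def far_future_headers(now = Time.now)
  three_years = 3 * 365 * 24 * 60 * 60 # seconds, ignoring leap days

  {
    'Cache-Control' => "public, max-age=#{three_years}",
    'Expires'       => (now + three_years).httpdate # HTTP date format, always GMT
  }
end

headers = far_future_headers(Time.utc(2008, 6, 4))
puts headers['Cache-Control']
puts headers['Expires']
```

How the hash reaches S3 depends on your copy of the gem. Our understanding is that string-keyed options passed to S3Object.store are sent along as request headers, but treat that as an assumption and check it against the version you have frozen.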
Comments
Very nice, are you planning to submit this as a patch to Marcel’s main git branch?
It looks as though Marcel modified the S3Object.copy method to replace the ad-hoc approach with the official API call on June 6, 2008 (2 days after this post). See http://github.com/marcel/aws-s3/commit/db13e585c6008fa3c7ec5f710bde936945e6570b for details.
It is interesting, however, that his modification doesn’t appear to allow copying between buckets, but only to new objects within the same bucket.