UUIDs and Rails find_in_batches
UUIDs (universally unique identifiers) are really neat as IDs: they let you know the ID of a model before it is even saved, and you can be practically certain that the insert won’t fail because the ID is already taken. They’re also a first-class type in PostgreSQL, which is even more badass because the database handles the storage itself, so you don’t pay the performance penalty that would come with stuffing the UUID into a varchar (or equivalent).
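As a rough illustration of the "ID before save" point, here is a minimal sketch in plain Ruby. The Widget struct is a hypothetical stand-in for a model (a real app would use an ActiveRecord class), and SecureRandom.uuid comes from Ruby's standard library.

```ruby
require "securerandom"

# Hypothetical stand-in for a model; not an ActiveRecord class.
Widget = Struct.new(:id, :name)

# The ID is generated on the client side, before anything touches the database.
widget = Widget.new(SecureRandom.uuid, "example")

# The ID can already be handed to logs, queues, or associated records.
puts widget.id
```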
When you try to use UUIDs with Rails, things fall apart with #find_in_batches because its implementation abuses value comparisons to get a performance benefit when paging through results (source).
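To make the value-comparison idea concrete, here is a small sketch of that style of paging over in-memory data. It is not the Rails source: ROWS, keyset_batches, and the start: parameter are made up for illustration, and the rows use integer IDs on purpose.

```ruby
# Hypothetical in-memory rows with integer IDs, standing in for a table.
ROWS = [3, 1, 7, 5, 9].map { |id| { id: id } }

# Key-comparison paging: each pass fetches the rows whose id is greater
# than the last id seen, instead of skipping rows with an offset.
def keyset_batches(rows, batch_size, start: 0)
  last_id = start - 1
  loop do
    batch = rows.select { |r| r[:id] > last_id }
                .sort_by { |r| r[:id] }
                .first(batch_size)
    break if batch.empty?
    yield batch
    last_id = batch.last[:id]
  end
end

batches = []
keyset_batches(ROWS, 2) { |b| batches << b.map { |r| r[:id] } }
p batches # => [[1, 3], [5, 7], [9]]
```

A scheme like this leans on an integer starting value such as 0, which has no meaning for a UUID key.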
We started using UUIDs on a few models, so I wrote a new version of #find_in_batches that can work with either type of ID (but does not support the :start option).
This also fixes #find_each, because #find_each uses #find_in_batches under the covers.
I figured it’d be useful to someone eventually, so here’s the code:
In lib/clean_find_in_batches.rb:
module CleanFindInBatches
  def self.included(base)
    base.class_eval do
      # Keep the original around so integer-keyed models can still use it
      alias :old_find_in_batches :find_in_batches
      alias :find_in_batches :replacement_find_in_batches
    end
  end

  # Override because the regular find_in_batches implementation
  # conflicts with UUID primary keys
  def replacement_find_in_batches(options = {}, &block)
    relation = self
    return old_find_in_batches(options, &block) if relation.primary_key.is_a?(Arel::Attributes::Integer)

    # Throw errors like the real thing
    if (finder_options = options.except(:batch_size)).present?
      raise "You can't specify an order, it's forced to be #{batch_order}" if options[:order].present?
      raise "You can't specify a limit, it's forced to be the batch_size" if options[:limit].present?
      raise "You can't specify start, it's forced to be 0 because the ID is a string" if options.delete(:start)
      relation = apply_finder_options(finder_options)
    end

    # Compute the batch size
    batch_size = options.delete(:batch_size) || 1000
    offset = 0

    # Get the relation and keep going over it until there's nothing left
    relation = relation.except(:order).order(batch_order).limit(batch_size)
    while (results = relation.offset(offset).limit(batch_size).all).any?
      block.call results
      offset += batch_size
    end
    nil
  end
end
And in config/initializers/clean_find_in_batches.rb:
ActiveRecord::Batches.send(:include, CleanFindInBatches)
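For comparison, the offset-based loop at the heart of the replacement can be sketched without Rails. This is only an illustration under assumptions: offset_batches and the in-memory rows are hypothetical, with array slicing standing in for the order, offset, and limit calls on the relation.

```ruby
require "securerandom"

# Hypothetical in-memory rows keyed by UUID strings.
rows = Array.new(5) { { id: SecureRandom.uuid } }

# Offset paging: sort once by id, then walk the result in fixed-size windows.
# Nothing here compares ids against a numeric starting value.
def offset_batches(rows, batch_size)
  ordered = rows.sort_by { |r| r[:id] }
  offset = 0
  # Array#[] returns nil once offset is past the end, so guard for that too.
  while (batch = ordered[offset, batch_size]) && batch.any?
    yield batch
    offset += batch_size
  end
  nil
end

offset_batches(rows, 2) { |batch| puts batch.map { |r| r[:id] }.join(", ") }
```

The trade-off is that offset paging asks the database to skip over every earlier row, so deep pages get slower, and rows inserted or deleted while iterating can shift the windows.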