There are 3 ways to do eager loading (preloading) in Rails:
  • includes()
  • preload()
  • eager_load()
- includes() delegates the job to preload() or eager_load() depending on the presence or absence of condition related to one of the preloaded table.
- preload() is using separate DB queries to get the data.
- eager_load() is using one big query with LEFT JOIN for each eager loaded table.
You are probably already familiar with the method includes() for eager loading data from database if you are using Rails and ActiveRecord. But do you know why you sometimes get few small and nice SQL queries and sometimes one giant query with every table and column renamed? And do you know about preload() and eager_load() which can help you achieve the same goal?
Let's start with our Active Record class and associations definitions that we are going to use throughout the whole post:
class User < ActiveRecord::Base
  has_many :addresses
end

class Address < ActiveRecord::Base
  belongs_to :user
end

And here is the seed data that will help us check the results of our queries:

anshul = User.create!(name: "Anshul Kumar", email: "anshul@gmail.com")
aashu = User.create!(name: "Aashu", email: "aashu@gmail.com")

anshul.addresses.create!(country: "India", city: "New Delhi")
aashu.addresses.create!(country: "India", city: "Moradabad")

Typically, when you want to use the eager loading feature you would use the includes() method, so what are those two other methods for? First let's see all of them in action.

Rails 3:

User.includes(:addresses)
#  SELECT "users".* FROM "users" 
#  SELECT "addresses".* FROM "addresses" WHERE "addresses"."user_id" IN (1, 2)

User.preload(:addresses)
#  SELECT "users".* FROM "users" 
#  SELECT "addresses".* FROM "addresses" WHERE "addresses"."user_id" IN (1, 2)
User.eager_load(:addresses)
#  SELECT"users"."id" AS t0_r0, "users"."name" AS t0_r1, "users"."email" AS t0_r2, "users"."created_at" AS t0_r3, "users"."updated_at" AS t0_r4, "addresses"."id" AS t1_r0, "addresses"."user_id" AS t1_r1, "addresses"."country" AS t1_r2, "addresses"."city" AS t1_r3, "addresses"."created_at" AS t1_r4, "addresses"."updated_at" AS t1_r5FROM "users" LEFT OUTER JOIN "addresses" ON "addresses"."user_id" = "users"."id"

If you use preload(), it means you always want separate queries. If you use eager_load() you are doing one query. So what is includes() for? It decides for you which way it is going to be. You let Rails handle that decision. And the decision is based on query conditions. Let's see an example where includes() delegates to eager_load() so that there is one big query only.

User.includes(:addresses).where("addresses.country = ?", "New Delhi")
User.eager_load(:addresses).where("addresses.country = ?", "New Delhi")

# SELECT "users"."id" AS t0_r0, "users"."name" AS t0_r1, "users"."email" AS t0_r2, "users"."created_at" AS t0_r3, "users"."updated_at" AS t0_r4,"addresses"."id" AS t1_r0, "addresses"."user_id" AS t1_r1, "addresses"."country" AS t1_r2, "addresses"."city" AS t1_r3, "addresses"."created_at" AS t1_r4, "addresses"."updated_at" AS t1_r5 FROM "users"LEFT OUTER JOIN "addresses" ON "addresses"."user_id" = "users"."id" WHERE (addresses.country = 'New Delhi')

In the last example Rails detected that the condition in where clause is using columns from preloaded (included) table names. So includes() delegates the job to eager_load(). You can always achieve the same result by using the eager_load() method directly.
What happens if you try to use preload() in query based condition?

User.preload(:addresses).where("addresses.country = ?", "New Delhi")
#  SELECT "users".* FROM "users" WHERE (addresses.country = 'New Delhi')
#
#  SQLite3::SQLException: no such column: addresses.country

We get an exception because we haven't joined users table with addresses table in any way.

For using preload() with condition we have to join users table with addresses table:
User.joins(:addresses).where("addresses.country = ?", "New Delhi").preload(:addresses)
# SELECT "users".* FROM "users" INNER JOIN "addresses" ON "addresses"."user_id" = "users"."id" WHERE (addresses.country = 'New Delhi')
# SELECT "addresses".* FROM "addresses" WHERE "addresses"."user_id" IN (1,2)

Rails 4:

In Rails 4 everything is same except you should use references() combined with includes() if you have the additional condition for one of the eager loaded table.

User.includes(:addresses).where("addresses.country = ?", "Moradabad")

#DEPRECATION WARNING: It looks like you are eager loading table(s)
# (one of: users, addresses) that are referenced in a string SQL
# snippet. For example: 
#
#    Post.includes(:comments).where("comments.title = 'foo'")
#
# Currently, Active Record recognizes the table in the string, and knows
# to JOIN the comments table to the query, rather than loading comments
# in a separate query. However, doing this without writing a full-blown
# SQL parser is inherently flawed. Since we don't want to write an SQL
# parser, we are removing this functionality. From now on, you must explicitly
# tell Active Record when you are referencing a table from a string:
#
#   Post.includes(:comments).where("comments.title = 'foo'").references(:comments)
# 
# If you don't rely on implicit join references you can disable the
# feature entirely by setting `config.active_record.disable_implicit_join_references = true`. (
# SELECT "users"."id" AS t0_r0, "users"."name" AS t0_r1, "users"."email" AS t0_r2, "users"."created_at" AS t0_r3, "users"."updated_at" AS t0_r4,"addresses"."id" AS t1_r0, "addresses"."user_id" AS t1_r1, "addresses"."country" AS t1_r2, "addresses"."city" AS t1_r3, "addresses"."created_at" AS t1_r4, "addresses"."updated_at" AS t1_r5 FROM "users"LEFT OUTER JOIN "addresses" ON "addresses"."user_id" = "users"."id" WHERE (addresses.country = 'Moradabad')
In other words, because Rails does not want to be super smart anymore and spy on our where conditions to detect which algorithm to use, it expects our help. We must tell it that there is condition for one of the tables.

User.includes(:addresses).where("addresses.country = ?", "Moradabad").references(:addresses)
# SELECT "users"."id" AS t0_r0, "users"."name" AS t0_r1, "users"."email" AS t0_r2, "users"."created_at" AS t0_r3, "users"."updated_at" AS t0_r4,"addresses"."id" AS t1_r0, "addresses"."user_id" AS t1_r1, "addresses"."country" AS t1_r2, "addresses"."city" AS t1_r3, "addresses"."created_at" AS t1_r4, "addresses"."updated_at" AS t1_r5 FROM "users"LEFT OUTER JOIN "addresses" ON "addresses"."user_id" = "users"."id" WHERE (addresses.country = 'Moradabad')
Eager Loading

2 thoughts on “Eager Loading

  • September 25, 2014 at 10:17 am
    Permalink

    Drupal is one of these effective software packages that helps people and businesses publish content
    on their websites. One should keep in mind that
    communication lines for live support are kept open for paid accounts, who are given top priority.
    The amount of space required by a website should also be considered
    while choosing a web host.

    Reply
  • March 8, 2015 at 3:15 am
    Permalink

    Hi blogger, i found this post on 19 spot in google’s search results.
    I’m sure that your low rankings are caused by high bounce rate.
    This is very important ranking factor. One of the biggest reason for high bounce rate
    is due to visitors hitting the back button. The higher your bounce
    rate the further down the search results your posts
    and pages will end up, so having reasonably low bounce rate is important for improving your rankings naturally.
    There is very useful wordpress plugin which can help you.

    Just search in google for:
    Seyiny’s Bounce Plugin

    Reply

Leave a Reply

Your email address will not be published.