Understanding Bundler - To `bundle exec` or Not? That Is the Question
We Ruby developers are used to running scripts and commands with the `bundle exec` prefix. Sometimes it’s needed and sometimes it isn’t, and when it isn’t needed everything still works fine if we add it. So it may not be clear why we need to use it in some cases.
In this blogpost I’ll try to answer these questions with a little insight on what Bundler (and Ruby and Rubygems) do.
What does Bundler do?
We use Bundler for a few different things:
- Resolve dependencies and versions for all the gems required in a project
- Store the calculated versions in a file so all the developers have the same gem versions
- Make sure our Ruby code has access to those specific versions of the gems
- Report which gems have newer versions that still fulfill all the other gems’ version restrictions
I’m only going to talk about how Bundler makes sure our code uses specific versions of the gems.
The Problem
When we are writing a Ruby script and want to use code from another file, we write something like `require 'csv'`, and Ruby tries to find that file in our system.
`require` is a method defined in the Kernel module. There are other ways to load code (like `require_relative` or Rails’ autoloading and lazy loading mechanisms), but I am only going to focus on this one for simplicity.
How Ruby finds the file depends on what we are trying to require.
The $LOAD_PATH Global Variable
Ruby keeps an array with all the paths where it knows to look for code. We have all seen this variable somewhere while coding, but it’s one of those things we prefer not to touch because changing it can break something else.
If we print the content of this array, we can see a list of paths in our system:
#irb
2.6.6 :001 > pp $LOAD_PATH
["/Users/arielj/.rvm/rubies/ruby-2.6.6/lib/ruby/gems/2.6.0/gems/did_you_mean-1.3.0/lib",
"/Users/arielj/.rvm/rubies/ruby-2.6.6/lib/ruby/site_ruby/2.6.0",
"/Users/arielj/.rvm/rubies/ruby-2.6.6/lib/ruby/site_ruby/2.6.0/x86_64-darwin19",
"/Users/arielj/.rvm/rubies/ruby-2.6.6/lib/ruby/site_ruby",
"/Users/arielj/.rvm/rubies/ruby-2.6.6/lib/ruby/vendor_ruby/2.6.0",
"/Users/arielj/.rvm/rubies/ruby-2.6.6/lib/ruby/vendor_ruby/2.6.0/x86_64-darwin19",
"/Users/arielj/.rvm/rubies/ruby-2.6.6/lib/ruby/vendor_ruby",
"/Users/arielj/.rvm/rubies/ruby-2.6.6/lib/ruby/2.6.0",
"/Users/arielj/.rvm/rubies/ruby-2.6.6/lib/ruby/2.6.0/x86_64-darwin19"]
You can see, for example, that I’m running Ruby 2.6.6 and using RVM.
Requiring a Module from the Standard Library
When we require something like the `csv` module, it is part of the standard library (i.e., it comes with Ruby). In this case, we can go over all the paths listed in that array until we find a file named `csv.rb`. If we go to `/Users/arielj/.rvm/rubies/ruby-2.6.6/lib/ruby/2.6.0` we do indeed find it. Ruby does the same search to find the script and then loads the module so we can use it.
When it can’t find a file matching the name we required, it will raise an error. We have probably all seen more of that than we want to:
LoadError (cannot load such file -- some_unknown_module)
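To make the search concrete, here is a minimal sketch of the lookup described above. It is a simplified model, not Ruby’s actual implementation: `find_in_load_path` and `demo_lookup` are hypothetical helpers, and the real `require` also handles compiled extensions, absolute paths and files that were already loaded.

```ruby
require "tmpdir"

# Hypothetical helper (not part of Ruby): return the first "<dir>/<name>.rb"
# found in the given list of directories, or nil when nothing matches.
def find_in_load_path(name, load_path = $LOAD_PATH)
  load_path.each do |dir|
    candidate = File.join(dir.to_s, "#{name}.rb")
    return candidate if File.file?(candidate)
  end
  nil
end

# Demo with a throwaway directory so it does not depend on the local Ruby install.
def demo_lookup
  Dir.mktmpdir do |dir|
    File.write(File.join(dir, "greeter.rb"), "# pretend library code\n")
    found = find_in_load_path("greeter", [dir])
    missing = find_in_load_path("some_unknown_module", [dir])
    [File.basename(found), missing]
  end
end

p demo_lookup
```

A `nil` result is the situation where the real `require` raises `LoadError`.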
Requiring a Gem
If we look back at the `$LOAD_PATH` array, we’ll notice the only reference to a gem is the `did_you_mean` gem, and there’s no reference to a base `gems` directory. So how do we tell Ruby where all the other gems are? If we don’t, it will raise `LoadError`.
Here is where Rubygems comes into play. If you check the list of what Bundler does, you’ll notice it does not download the gems. When we run `bundle install`, it uses Rubygems to do that. Rubygems handles installation, uninstallation and activation of gems. Once a gem is activated, Ruby is able to find it.
Rubygems overrides the `require` method of the Kernel module to activate gems when needed. We are not going into too much detail here; the Kernel override is really complex and out of the scope of this article.
For what we need to know, the new method first checks whether there’s a gem with that name in the directory Rubygems controls. If there is one, Rubygems adds a new path to the `$LOAD_PATH` array and then calls the original `require` method. The original method finds the file we were looking for since its directory is now in the `$LOAD_PATH` thanks to Rubygems (this action of adding a path to the array is the activation of the gem).
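The activation step can be sketched with a toy model. Everything here is an assumption made for illustration: `ToyGems` is a made-up class that works on its own array instead of the real `$LOAD_PATH`, and it assumes gems are stored as `<name>-<version>/lib` folders. The real Rubygems code does much more.

```ruby
require "tmpdir"
require "fileutils"

# Toy model of gem activation (not the real Rubygems code).
class ToyGems
  attr_reader :load_path

  def initialize(gems_dir, load_path = [])
    @gems_dir = gems_dir
    @load_path = load_path
  end

  # Plays the role of the original require: it only searches load_path.
  def find(name)
    @load_path.map { |dir| File.join(dir, "#{name}.rb") }.find { |file| File.file?(file) }
  end

  # Plays the role of the overridden require: activate a matching gem first,
  # then delegate to the plain lookup.
  def find_with_activation(name)
    activate(name)
    find(name)
  end

  # Activation = adding the gem's lib folder to the load path (only once).
  # This toy picks any installed version; version selection is ignored here.
  def activate(name)
    lib = Dir.glob(File.join(@gems_dir, "#{name}-*", "lib")).first
    return false unless lib
    @load_path.unshift(lib) unless @load_path.include?(lib)
    true
  end
end

def demo_activation
  Dir.mktmpdir do |gems_dir|
    lib = File.join(gems_dir, "hello-1.0.0", "lib")
    FileUtils.mkdir_p(lib)
    File.write(File.join(lib, "hello.rb"), "# pretend gem code\n")

    toy = ToyGems.new(gems_dir)
    not_found_before = toy.find("hello").nil?
    found_after = toy.find_with_activation("hello") == File.join(lib, "hello.rb")
    toy.find_with_activation("hello") # a second call must not add the path again
    [not_found_before, found_after, toy.load_path == [lib]]
  end
end

p demo_activation
```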
This is our `$LOAD_PATH` after requiring a gem:
#irb
2.6.6 :002 > require 'bundler'
=> true
2.6.6 :003 > pp $LOAD_PATH
["/Users/arielj/.rvm/rubies/ruby-2.6.6/lib/ruby/gems/2.6.0/gems/did_you_mean-1.3.0/lib",
"/Users/arielj/.rvm/gems/ruby-2.6.6/gems/bundler-2.1.4/lib",
"/Users/arielj/.rvm/rubies/ruby-2.6.6/lib/ruby/site_ruby/2.6.0",
"/Users/arielj/.rvm/rubies/ruby-2.6.6/lib/ruby/site_ruby/2.6.0/x86_64-darwin19",
"/Users/arielj/.rvm/rubies/ruby-2.6.6/lib/ruby/site_ruby",
"/Users/arielj/.rvm/rubies/ruby-2.6.6/lib/ruby/vendor_ruby/2.6.0",
"/Users/arielj/.rvm/rubies/ruby-2.6.6/lib/ruby/vendor_ruby/2.6.0/x86_64-darwin19",
"/Users/arielj/.rvm/rubies/ruby-2.6.6/lib/ruby/vendor_ruby",
"/Users/arielj/.rvm/rubies/ruby-2.6.6/lib/ruby/2.6.0",
"/Users/arielj/.rvm/rubies/ruby-2.6.6/lib/ruby/2.6.0/x86_64-darwin19"]
We can see that the second element of the array is now the path of Bundler. But why is it loading version `2.1.4` and not a different one?
Requiring a Specific Version of a Gem
Rubygems will activate the newest version we have installed on our system. This can quickly become a problem:
- if I update a gem when working on a project, it will also change for other projects on my machine
- if another developer joins the project, that developer will have to download the gems with the same versions I used
- new gem versions may not be compatible with other gems that my project depends on
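As a small illustration of what “newest version” means, Rubygems ships a `Gem::Version` class that compares versions segment by segment instead of as plain strings. The list of installed versions below is made up, and `newest_version` is a hypothetical helper.

```ruby
require "rubygems"

# Hypothetical helper: pick the highest version from a list of version strings.
def newest_version(versions)
  versions.max_by { |version| Gem::Version.new(version) }
end

installed = %w[2.1.4 2.0.2 1.17.3] # made-up list of installed versions
puts newest_version(installed)

# Plain string comparison gets multi-digit segments wrong, Gem::Version does not.
puts %w[9.0.0 10.0.0].max
puts newest_version(%w[9.0.0 10.0.0])
```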
This is, finally, where Bundler comes into play. Every project that uses Bundler has a `Gemfile` (*) specifying the gems and version restrictions the project needs. After running Bundler, the project also has a `Gemfile.lock` file with the specific gem versions (or git commit hashes) Bundler calculated to make all the gems compatible.
(*) `Gemfile` is the default name, but it can be changed: a project could use a different file name for a file serving the same purpose.
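To get a feeling for what a version restriction allows, we can ask Rubygems directly with `Gem::Requirement`. The restrictions and versions below are only examples, and `allowed?` is a hypothetical helper.

```ruby
require "rubygems"

# In a Gemfile, a restriction looks like this (example line, not from a real project):
#   gem "rails", "~> 6.0"

# Hypothetical helper: does this version satisfy this restriction?
def allowed?(restriction, version)
  Gem::Requirement.new(restriction).satisfied_by?(Gem::Version.new(version))
end

puts allowed?("~> 6.0", "6.1.4")   # "~> 6.0" means >= 6.0 and < 7.0
puts allowed?("~> 6.0", "7.0.0")
puts allowed?("~> 6.0.3", "6.1.0") # "~> 6.0.3" means >= 6.0.3 and < 6.1
```

The `Gemfile.lock` then records the single version that Bundler picked inside each allowed range.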
When Bundler runs, it reads this `Gemfile.lock` file and activates the specified version of each gem (i.e., it adds their paths to the `$LOAD_PATH` array). Now, when we require a gem, Ruby finds it, and it is the specified version. If the file isn’t found in those paths, the call still goes through the Rubygems `require` method, so standard library files can still be required. But Bundler restricts Rubygems to the gems in the bundle, so requiring a gem that is not listed in our `Gemfile.lock` file will usually end in a `LoadError`.
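Here is a rough sketch of the idea of turning a `Gemfile.lock` into load paths. It is a toy under stated assumptions: the lockfile content is made up, the regular expression only understands the `specs:` entries, and `locked_versions` and `locked_lib_paths` are hypothetical helpers. Bundler has its own parser (`Bundler::LockfileParser`) and asks Rubygems for the real paths.

```ruby
# Made-up lockfile content, for illustration only.
SAMPLE_LOCKFILE = <<~LOCK
  GEM
    remote: https://rubygems.org/
    specs:
      my_gem (1.0.0)
        rack (~> 2.0)
      rack (2.2.3)
      rake (13.0.1)

  DEPENDENCIES
    my_gem
    rake
LOCK

# Locked gems are the lines indented by exactly four spaces: "    name (version)".
# Deeper lines are dependencies of a gem and are skipped.
def locked_versions(lockfile)
  lockfile.each_line(chomp: true).each_with_object({}) do |line, versions|
    match = line.match(/\A {4}(\S+) \((\S+)\)\z/)
    versions[match[1]] = match[2] if match
  end
end

# Assumes gems live in "<gems_dir>/<name>-<version>/lib".
def locked_lib_paths(lockfile, gems_dir)
  locked_versions(lockfile).map do |name, version|
    File.join(gems_dir, "#{name}-#{version}", "lib")
  end
end

p locked_versions(SAMPLE_LOCKFILE)
puts locked_lib_paths(SAMPLE_LOCKFILE, "/path/to/gems")
```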
How to Use Bundler
Bundler can be used in two different ways:
- We can prefix our commands with `bundle exec`
- We can run Bundler programmatically
Using bundle exec my_command
When we do this, Bundler loads before our script. It reads the `Gemfile.lock` file, adds the paths for each gem to the `$LOAD_PATH` array, and then executes `my_command`. That way, our script runs with the gems already activated.
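One way to observe the difference is a tiny script that reports whether Bundler is already loaded when our code starts. This is a sketch: `bundler_context` is a hypothetical helper, and it relies on my understanding that `bundle exec` exports the `BUNDLE_GEMFILE` environment variable for the command it runs.

```ruby
# Hypothetical helper: describe the Bundler-related state of the current process.
def bundler_context(env = ENV)
  {
    bundler_loaded: !defined?(::Bundler).nil?, # true when Bundler was loaded before us
    gemfile: env["BUNDLE_GEMFILE"]             # Gemfile path, if one was exported
  }
end

p bundler_context
```

Saving this as a script and running it once with `ruby` and once with `bundle exec ruby` inside a project should show different values.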
Running Bundler Programmatically
Bundler is a gem like any other, so we can require it inside our script and call its `require` method to make it load all the paths into the `$LOAD_PATH` array (and require the gems listed in the Gemfile) when we want to:
# irb
2.6.6 :001 > require 'bundler'
=> true
2.6.6 :002 > Bundler.require
This is actually what Rails does. If we open the file `config/application.rb` we can see something like this:
# config/application.rb
if defined?(Bundler)
...
Bundler.require(*Rails.groups(assets: %w[development test]))
...
end
It’s not just Rails; the Hanami framework also uses this approach:
# https://github.com/hanami/hanami/blob/master/bin/hanami
require 'bundler'
...
::Bundler.require(:plugins) if File.exist?(ENV["BUNDLE_GEMFILE"] || "Gemfile")
...
This second method gives us the freedom to use Bundler if it is present and skip it if not, and it also saves us from having to type `bundle exec` before every command.
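The “use Bundler only if present” idea can be sketched for our own scripts like this. `setup_bundler_if_present` is a hypothetical helper. `require "bundler/setup"` is the part that comes from Bundler, and it sets up the load path from the lockfile without requiring the gems.

```ruby
# Hypothetical helper: set up Bundler only when a Gemfile exists.
# Returns true when Bundler was set up, false when there was nothing to do.
def setup_bundler_if_present(gemfile = ENV["BUNDLE_GEMFILE"] || "Gemfile")
  return false unless File.exist?(gemfile)

  ENV["BUNDLE_GEMFILE"] ||= File.expand_path(gemfile)
  require "bundler/setup" # adds the locked gems to $LOAD_PATH
  true
end
```

A script could then start with `Bundler.require(:default) if setup_bundler_if_present` and still work in a folder without a Gemfile.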
Sometimes the `rails` Command Is Not Found
I just said that a Rails app calls `Bundler.require`, so adding the `bundle exec` prefix is not needed. But we have probably all had this issue: we want to run `rails s` or `rails c`, the shell can’t find the `rails` command, and we end up running it as `bundle exec rails ...` anyway.
This happens because the system can’t find the `rails` command. Similar to Ruby’s `$LOAD_PATH` array, our system has a `PATH` environment variable that lists the directories where it looks for the commands we want to run. The `bundle` executable is installed in the same directory as the `ruby` executable, but the `rails` executable may be in a different one that is not listed in `PATH`.
In those cases we have three options:
- add the missing path to the `PATH` env variable
- prefix our commands with `bundle exec`
- use the executables we may have in the `bin` folder of our project
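The way the system searches `PATH` is very similar to the `$LOAD_PATH` lookup, and we can sketch it in Ruby. `which` here is a hypothetical helper written for this post (it imitates the shell command of the same name), and `demo_which` uses a temporary folder instead of a real Rails install.

```ruby
require "tmpdir"

# Hypothetical helper: return the first executable named `command` found in
# the directories listed in `path`, or nil when none of them has it.
def which(command, path = ENV["PATH"])
  path.to_s.split(File::PATH_SEPARATOR).each do |dir|
    next if dir.empty?
    candidate = File.join(dir, command)
    return candidate if File.file?(candidate) && File.executable?(candidate)
  end
  nil
end

# Demo: a fake "rails" executable is only found when its folder is in the path.
def demo_which
  Dir.mktmpdir do |bin_dir|
    fake = File.join(bin_dir, "rails")
    File.write(fake, "#!/bin/sh\n")
    File.chmod(0o755, fake)
    [which("rails", bin_dir) == fake, which("rails", "")]
  end
end

p demo_which
```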
Running `bundle exec` and `Bundler.require` at the same time is not a problem: Bundler won’t activate gems twice. So it’s safe to use `bundle exec` even when it isn’t needed, as long as there’s a `Gemfile` in that directory.
Bonus: How Does RVM Work?
Since we are already talking about the `PATH` env variable, let’s see what RVM does to change which Ruby version we use.
This is my `PATH` when using Ruby 2.6.6 in bash:
# bash
% echo $PATH
/Users/arielj/.rvm/gems/ruby-2.6.6/bin:/Users/arielj/.rvm/gems/ruby-2.6.6@global/bin:/Users/arielj/.rvm/rubies/ruby-2.6.6/bin:/Users/arielj/.rvm/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/Library/Apple/usr/bin
And after running `rvm use 2.6.2`:
# bash
% rvm use 2.6.2
Using /Users/arielj/.rvm/gems/ruby-2.6.2
% echo $PATH
/Users/arielj/.rvm/gems/ruby-2.6.2/bin:/Users/arielj/.rvm/gems/ruby-2.6.2@global/bin:/Users/arielj/.rvm/rubies/ruby-2.6.2/bin:/Users/arielj/.rvm/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/Library/Apple/usr/bin
We can see it simply changes the `PATH` env variable to point to the Ruby version we want to use. Now, when we run the `ruby` command, our system finds the executable inside one of those folders.
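The effect on `PATH` can be modeled in a few lines of Ruby. This only imitates the result we saw above and is not how RVM is implemented (RVM works at the shell level). `switch_ruby_in_path` is a hypothetical helper, and the paths are shortened versions of the ones in my output.

```ruby
# Hypothetical helper: rewrite every PATH entry that mentions one Ruby version
# so it points to another one, leaving the other entries untouched.
def switch_ruby_in_path(path, from, to)
  path.split(":").map { |dir| dir.gsub("ruby-#{from}", "ruby-#{to}") }.join(":")
end

before = "/Users/arielj/.rvm/gems/ruby-2.6.6/bin:" \
         "/Users/arielj/.rvm/gems/ruby-2.6.6@global/bin:" \
         "/Users/arielj/.rvm/rubies/ruby-2.6.6/bin:/usr/bin"

puts switch_ruby_in_path(before, "2.6.6", "2.6.2")
```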
Conclusion
We learned how Bundler and Rubygems interact with each other and “trick” Ruby to give us a consistent environment, and which problems this technique solves (there are similar solutions for other programming languages, like pip for Python, Composer for PHP, Yarn for Node.js, etc.).
We now have a better understanding of when we need to add the `bundle exec` prefix to a command and when we can skip it to save some time.
And as a bonus, we also learned how RVM uses a somewhat similar solution to let us run any Ruby version on one system.