Articles about Ruby

Machine Learning: An Introduction to Gradient Boosting

Welcome to the third article in our Machine Learning with Ruby series!

In our previous article Machine Learning: An Introduction to CART Decision Trees in Ruby, we covered CART decision trees and built a simple tree of our own. We then looked into our first ensemble model technique, Random Forests, in Machine Learning: An Introduction to Random Forests. It is a good idea to review that article before diving into this one.

Random Forests are great for a wide variety of cases, but there are also situations where they don’t perform quite as well. In this article we’ll take a look at another popular tree-based ensemble model: Gradient Boosting.

Read more »

Machine Learning: An Introduction to Random Forests

In our previous article Machine Learning: An Introduction to CART Decision Trees in Ruby, we covered CART decision trees and built a simple tree of our own. Decision trees are very flexible and are a good tool for simple classification, but they are often not enough when it comes to real-world scenarios.

When dealing with large and complex data, or when dealing with data with a significant amount of noise, we need something more powerful. That’s where ensemble models come into play. Ensemble models combine a number of weak learners to build a strong model, with increased accuracy and robustness. Ensembles also help manage and reduce bias and overfitting.

In this article, we’ll cover a very popular tree-based ensemble model: Random Forest.

Read more »

Machine Learning: An Introduction to CART Decision Trees in Ruby

In the middle of last year, we released an internal tool to help address a pretty significant issue. That is how the Pecas tool was born, and you can read about the Business Case for Pecas here.

Pecas relies on a binary classification machine learning model to classify time entries as valid or invalid. It is a combination of a Django app, that hosts the Slackbot and other data processing tasks, and a FastAPI app that hosts the machine learning model built using the Scikit-learn Python library. Scikit-learn provides a great set of classification models you can use, which are optimized and very robust, making it a solid choice to build your model. However, understanding the principles behind the classification can be a bit tricky, and machine learning models can feel a bit like a black box.

In this series, we’ll explore some principles of machine learning, namely binary classifiers, and walk through how they connect to each other, in Ruby. This article will focus on decision trees, namely CART (Classification And Regression Trees) and a little bit of the mathematics behind them.

Read more »

Handling Environment Variables in Ruby

Configuring your Rails application can be tricky. How do you define secrets? How do you set different values for your local development and for production? How can you make it more maintainable and easy to use?

Using environment variables to store information in the environment itself is one of the most used techniques to address some of these issues. However, if not done properly, the developer experience can deteriorate over time, making it difficult to onboard new team members. Security vulnerabilities can even be introduced if secrets are not handled with care.

In this article, we’ll talk about a few tools that we like to use at OmbuLabs and ideas to help you manage your environment variables efficiently.

Read more »

Unit Testing our Design Patterns exercise

So, our little exercise in design patterns is getting quite messy. Which is ironic, considering it’s an exercise in design patterns.

The reason is that I’m mostly trying to be very focused on the Design Patterns book and just fleshing out the example implementations they provide.

Therefore, in order to organize things, I believe this is the right time to add unit tests. As a plus, I also get to test my little gem in an automated fashion.

Here I’ll only go through the RandomMazeBuilder class since it would be quite lengthy to go through every single file. To see all the other specs, just checkout the repo.

Read more »

Design Patterns in Ruby - The Builder

In the last part of this series, we left our little maze game gem generating random mazes that had random kinds of rooms using the abstract factory pattern.

While that was good enough, in the case of the our maze game, it turns out mazes can be pretty complex objects, being a collection of rooms, doors and walls of many types. And even though I didn’t add too much variety of these components, things could get pretty convoluted.

It turns out there’s a pattern just for these cases: the Builder Pattern.

Read more »

The what, the why, and the how of Bloom Filter

Have you ever wondered how does Medium recommend blogs to read or how does a platform with millions of users tells if a username is available or taken? If yes, you have come to the right place, as we are going to look at the data structure that makes this and a lot more happen. The data structure is Bloom Filter.

Read more »

Understanding Bundler - To `bundle exec` or not? that is the question

We, Ruby developers, are used to running scripts or commands with the prefix bundle exec, but sometimes it’s not needed, but sometimes it is, and when it’s not needed it still works just fine if we add it. So it may not be clear why we need to use it in some cases.

In this blogpost I’ll try to answer these questions with a little insight on what Bundler (and Ruby and Rubygems) do.

Read more »

Exploring Method Arguments in Ruby: Part 3

In the first and second parts of this series we talked about positional and keyword arguments, but we still have some extra options so that our methods can do anything we want.

In this final part we are going to explore blocks, array decomposition, partially applied methods and a few syntax tricks. We’ll also study a few known methods to understand how everything is used in real world applications.

Read more »

Exploring Method Arguments in Ruby: Part 1

Ruby is an object oriented language where everything is an object (even methods are objects of the class Method!), so everything we need to do is done by calling methods on objects. That also means that methods have to provide a lot of flexibility because they are used everywhere.

Ruby provides a lot of options to pass arguments to our methods, so we’ll make this topic a series so it’s not too long. We’ll split the options into different categories and then break down everything with some examples and/or use cases.

Read more »

How to dynamically update your sitemap in Ruby

A few months ago I received the task of making the FastRuby.io sitemap refresh automatically after each deploy. That sounds like it would be pretty straightforward if we didn’t have one issue (it’s never that easy, right?). For the FastRuby.io blog we created a gem that encapsulates a Jekyll application. The discussion of why do we have a gem for our blog is actually a good topic for a new post. For now, I want to focus on the sitemap task that I had.

Since the blog is a gem, we also need to make sure that whatever tool we use to generate the sitemap covers new blog posts.

In this article I’ll show you my journey to figure out how to make everything work together.

Read more »

Blogcop: A GitHub app that helps you manage your Jekyll blog

At OmbuLabs we use Jekyll to generate our blog. If you are not familiar with it, here is a quick description from the Jekyll site:

“Jekyll is a simple, extendable, static site generator. You give it text written in your favorite markup language and it churns through layouts to create a static website. Throughout that process you can tweak how you want the site URLs to look, what data gets displayed in the layout, and more.”

Read more »

Code Conventions and Rubocop

Everyone has had the experience of working on a gnarly, difficult to understand code-base. The sort of code base that makes you hate your job. Often it comes down to poor design, but code conventions also play a large part in whether you wake up dreading your job in the morning. The overall design (choice of design patterns and how modules and classes are organized and factored) is the long range, big picture strategy of how an application will be made. Code conventions, by contrast, come down to the choices you make about which constructs of a language you use, which you don’t, and when.

Read more »

Trial and Error with Ruby Koans

After reading introductory primers and watching instructional videos for beginners learning Ruby, I needed a program that would allow me to test out Ruby concepts for myself. I began practicing Ruby Koans, a testing-based program that runs through a variety of Ruby concepts in a trial and error fashion.

Read more »

Announcing AfipBill

If you live in Argentina and you ever use AFIP, you should already know that their platform is not the best in terms of user friendliness. We wanted to integrate OmbuShop with AFIP (using their API) in order to generate and print the bills for each seller. Unfortunately, there is no way to do this because the API doesn’t generate a printable version (PDF) of the bill.

Read more »

How to use any gem in the Rails production console

How many times did you come across a great gem you wanted to try out in a production console, like benchmark-ips or awesome-print?

Be it for performance or for readability, sometimes it’s nice to be able to try out something new quickly without going through a pull request + deployment process. This is possible by modifying the $LOAD_PATH Ruby global variable and requiring the gem manually.

Read more »

Brief look at RSpec's formatting options

A few weeks ago, I noticed weird output in the RSpec test suite (~4000 tests) for a Rails application:

.............................................................................................unknown OID 353414: failed to recognize type of '<field>'. It will be treated as String  ...........................................................................................................................................

This Rails app uses a PostgreSQL database. After some Googling, it turns out that this is a warning from PostgreSQL. When the database doesn’t recognize the type to use for a column, it casts to string by default.

Read more »

Spy vs Double vs Instance Double

When writing tests for services, you may sometimes want to use mock objects instead of real objects. In case you’re using ActiveRecord and real objects, your tests may hit the database and slow down your suite. The latest release of the rspec-mocks library bundled with RSpec 3 includes at least three different ways to implement a mock object.

Let’s discuss some of the differences between a spy, a double and an instance_double. First, the spy:

[1] pry(main)> require 'rspec/mocks/standalone'
=> true
[2] pry(main)> user_spy = spy(User)
=> #<Double User>
[3] pry(main)> spy.whatever_method
=> #<Double (anonymous)>
Read more »

A comprehensive guide to interacting with IMAP using Ruby

A few times in the past I’ve had to interact with IMAP via Ruby, and wrapping your head around its API is not so easy. Not only is the IMAP API a bit obscure and cryptic, but Ruby’s IMAP documentation is not so great either.

Searching the internet for examples doesn’t yield too many results, so I’ll try to write down some of the things I’ve learned. The examples I’ll show use Gmail as the target IMAP server.

Read more »

Use session variables to optimize your user flow

Sessions provide you a nice little data storage feature where the application does not need to get the information directly from the database. So you do not have to persist data in your database and can easily store info about the user on the fly. This is a nice way to enhance the user experience on your page.

Let’s say that you want to show some users a new fancy sign up form and the rest the old form. If you store the version of the sign up form in a session variable, you don’t need to persist this info in your database.

Read more »

Let vs Instance Variables

Maybe in the past you stumbled over the two different approaches to setup your test variables. One way is the more programmatical approach by using instance variables, usually initialized in a before block.

Read more »

Why using default_scope is a bad idea

default_scope is a method provided by ActiveRecord, which allows you to set a default scope (as its name implies) for all operations done on a given model. It can be useful for allowing soft-deletion in your models, by having a deleted_on column on your model and setting the default scope to deleted_on: nil

Read more »

Enumerable#grep vs Enumerable#select

Often, Enumerable#select is the chosen method to obtain elements from an Array for a given block. Without thinking twice, we may be doing more work than necessary by not taking advantage of another method from the Enumerable module, Enumerable#grep.

Read more »