Articles by Francois Buys

Machine Learning: An Introduction to Gradient Boosting

Welcome to the third article in our Machine Learning with Ruby series!

In our previous article Machine Learning: An Introduction to CART Decision Trees in Ruby, we covered CART decision trees and built a simple tree of our own. We then looked into our first ensemble model technique, Random Forests, in Machine Learning: An Introduction to Random Forests. It is a good idea to review that article before diving into this one.

Random Forests are great for a wide variety of cases, but there are also situations where they don’t perform quite as well. In this article we’ll take a look at another popular tree-based ensemble model: Gradient Boosting.

Read more »

Machine Learning: An Introduction to Random Forests

In our previous article Machine Learning: An Introduction to CART Decision Trees in Ruby, we covered CART decision trees and built a simple tree of our own. Decision trees are very flexible and are a good tool for simple classification, but they are often not enough when it comes to real-world scenarios.

When dealing with large and complex data, or when dealing with data with a significant amount of noise, we need something more powerful. That’s where ensemble models come into play. Ensemble models combine a number of weak learners to build a strong model, with increased accuracy and robustness. Ensembles also help manage and reduce bias and overfitting.

In this article, we’ll cover a very popular tree-based ensemble model: Random Forest.

Read more »

Machine Learning: An Introduction to CART Decision Trees in Ruby

In the middle of last year, we released an internal tool to help address a pretty significant issue. That is how the Pecas tool was born, and you can read about the Business Case for Pecas here.

Pecas relies on a binary classification machine learning model to classify time entries as valid or invalid. It is a combination of a Django app, that hosts the Slackbot and other data processing tasks, and a FastAPI app that hosts the machine learning model built using the Scikit-learn Python library. Scikit-learn provides a great set of classification models you can use, which are optimized and very robust, making it a solid choice to build your model. However, understanding the principles behind the classification can be a bit tricky, and machine learning models can feel a bit like a black box.

In this series, we’ll explore some principles of machine learning, namely binary classifiers, and walk through how they connect to each other, in Ruby. This article will focus on decision trees, namely CART (Classification And Regression Trees) and a little bit of the mathematics behind them.

Read more »

Send emails that thread in Rails

Email threads are great for improving the user experience of your app. In this post we will learn how the RFC 5322 specification expects us to thread emails. We will also learn that emails don’t always work as we expect them to. At the end of this post you will have email threading as another tool in your Rails belt.

Read more »

The I in SOLID

Today we will discuss the I in SOLID which, you may or may not know, represents the Interface Segregation Principle (ISP). This is the fourth article in the SOLID series. We have already discussed the Single Responsibility, Open/Closed and Liskov Substitution principles.

In this post we will discuss the value of and the process for crafting easy to maintain interfaces. If we have enough time we will also discuss how interfaces might apply to dynamically typed languages such as Ruby. With no further ado, let us start by finding out what an interface actually is.

Read more »

The L in SOLID

This post is the third one in the SOLID principles series. The first post discussed the single responsibility principle and in the second post we discussed the open / closed principle. Next, as the title suggests, we will take a look at the principle represented by the letter L from the SOLID acronym. L is for the Liskov Substitution Principle (LSP).

In simple terms LSP requires that supertypes and subtypes be swappable without affecting the correctness of a program.

Read more »

The O in Solid

In a previous post we considered the practical value of the Single Responsibility Principle (SRP). This is the second post in this series where we take a deeper look at each of the SOLID principles.

Robert C. Martin (a.k.a. “uncle Bob”) refers to the O in Solid as the heart of Object Oriented (OO) design. He goes so far as to say that this principle improves reusability and maintainability more than any other OO principle. You most likely already know that the O in SOLID belongs to the Open/Closed Principle (OCP).

We often hear about SRP or DRY (don’t repeat yourself) but seemingly less often about OCP. It turns out that this principle lays the foundation for many of the OO best practices. In this post we will talk about OCP and find out why uncle Bob is such an advocate of OCP.

Read more »