Articles about Machine Learning

Techniques to Enhance the Capabilities of LLMs for your Specific Use Case

With the advent of widely available Large Language Models (LLMs), businesses everywhere have sought to leverage these models to increase the productivity of their teams, automate specific tasks, and improve the abilities of their chatbots, among a variety of other things.

However, LLMs are not great at handling domain-specific tasks out of the box. In this article, we’ll explore a few different techniques to enhance the capabilities of LLMs and help them perform well for your specific use case.

Read more »

Machine Learning: An Introduction to Gradient Boosting

Welcome to the third article in our Machine Learning with Ruby series!

In our previous article, Machine Learning: An Introduction to CART Decision Trees in Ruby, we covered CART decision trees and built a simple tree of our own. We then looked into our first ensemble model technique, Random Forests, in Machine Learning: An Introduction to Random Forests. It is a good idea to review that article before diving into this one.

Random Forests are great for a wide variety of cases, but there are also situations where they don’t perform quite as well. In this article, we’ll take a look at another popular tree-based ensemble model: Gradient Boosting.

Read more »

Machine Learning: An Introduction to Random Forests

In our previous article, Machine Learning: An Introduction to CART Decision Trees in Ruby, we covered CART decision trees and built a simple tree of our own. Decision trees are very flexible and are a good tool for simple classification, but they are often not enough when it comes to real-world scenarios.

When dealing with large and complex data, or when dealing with data with a significant amount of noise, we need something more powerful. That’s where ensemble models come into play. Ensemble models combine a number of weak learners to build a strong model, with increased accuracy and robustness. Ensembles also help manage and reduce bias and overfitting.
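As a rough illustration of how an ensemble combines weak learners, here is a minimal Ruby sketch of majority voting. The three rules, the feature names (`hours`, `description`, `project`) and the `ensemble_predict` helper are all hypothetical and chosen only for the example; a real Random Forest trains its trees on data rather than relying on hand-written rules.

```ruby
# Hypothetical weak learners: each is a single hand-written rule that looks
# at one feature of a sample. None of them is reliable on its own.
WEAK_LEARNERS = [
  ->(sample) { sample[:hours] > 8 ? :invalid : :valid },
  ->(sample) { sample[:description].to_s.length < 10 ? :invalid : :valid },
  ->(sample) { sample[:project].nil? ? :invalid : :valid }
].freeze

# Combine the individual predictions by majority vote.
# Enumerable#tally requires Ruby 2.7 or newer.
def ensemble_predict(learners, sample)
  votes = learners.map { |learner| learner.call(sample) }
  votes.tally.max_by { |_label, count| count }.first
end

sample = { hours: 9, description: "Paired on the billing refactor", project: "Acme" }
puts ensemble_predict(WEAK_LEARNERS, sample) # one :invalid vote, two :valid votes
```

Using an odd number of learners with two possible labels avoids ties, which is why the sketch uses three rules.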

In this article, we’ll cover a very popular tree-based ensemble model: Random Forest.

Read more »

Pecas: Machine Learning Problem Shaping and Algorithm Selection

In our previous article, Machine Learning Aided Time Tracking Review: A Business Case, we introduced the business case behind Pecas, an internal tool designed to help us analyse and classify time tracking entries as valid or invalid.

This series will walk through the process of shaping the original problem as a machine learning problem, building the Pecas machine learning model, and building the Slackbot that connects it to Slack.

In this first article, we’ll talk through shaping the problem as a machine learning problem and gathering the data available to analyse and process.

Read more »

Machine Learning: An Introduction to CART Decision Trees in Ruby

In the middle of last year, we released Pecas, an internal tool built to help address a pretty significant issue. You can read about how the tool was born in the Business Case for Pecas.

Pecas relies on a binary classification machine learning model to classify time entries as valid or invalid. It is a combination of a Django app, which hosts the Slackbot and other data processing tasks, and a FastAPI app, which hosts the machine learning model built using the Scikit-learn Python library. Scikit-learn provides a great set of optimized and very robust classification models, making it a solid choice to build your model. However, understanding the principles behind the classification can be a bit tricky, and machine learning models can feel a bit like a black box.

In this series, we’ll explore some principles of machine learning, focusing on binary classifiers, and walk through how they connect to each other, using Ruby. This article will focus on decision trees, specifically CART (Classification And Regression Trees), and a little bit of the mathematics behind them.
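To give a flavour of that mathematics: CART classification trees commonly pick a split by minimising Gini impurity, which is 1 minus the sum of the squared class proportions. The Ruby sketch below is an assumption about the kind of calculation involved, not the article’s own code, and the helper names `gini_impurity` and `split_impurity` are hypothetical.

```ruby
# Gini impurity of a list of class labels: 1 - sum(p_k ** 2),
# where p_k is the proportion of labels belonging to class k.
# A pure set (a single class) has impurity 0.0.
def gini_impurity(labels)
  return 0.0 if labels.empty?

  total = labels.size.to_f
  1.0 - labels.tally.values.sum { |count| (count / total)**2 }
end

# Impurity of a binary split: the impurity of each side,
# weighted by the share of samples that landed on that side.
def split_impurity(left, right)
  total = (left.size + right.size).to_f
  return 0.0 if total.zero?

  (left.size / total) * gini_impurity(left) +
    (right.size / total) * gini_impurity(right)
end

mixed = %i[valid valid invalid invalid]
puts gini_impurity(mixed)                             # evenly mixed labels
puts split_impurity(%i[valid valid], %i[invalid invalid]) # perfectly separating split
```

A tree builder would evaluate `split_impurity` for each candidate split and keep the one with the lowest value.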

Read more »

Machine Learning Aided Time Tracking Review: A Business Case

As an agency, our business model revolves around time. Our client activities rely on a dedicated number of hours per week worked on a project, and our internal activities follow the same pattern. As such, time tracking is a vital part of our work. Ensuring time is tracked correctly, and time entries meet a minimum quality standard, allows us to be more data-driven in our decisions, provide detailed invoices to our clients and better manage our own projects and initiatives.

Despite time tracking being a core activity, we had been having several issues with it not being completed, or not being completed properly. A report we ran at the end of 2022 showed our time tracking issues were actually quite severe. We lost approximately one million dollars in 2022 due to time tracking issues that led to decisions made on poor data. It was imperative that we solved the problem.

To help with this issue, we created an evolution of our Pecas project. We turned Pecas into a machine learning powered application capable of alerting users to issues in their time entries. In this article, we’ll talk through the business case behind it and the expected benefits to our company.

Read more »