THE POLARS DATAFRAME LIBRARY, BUT FOR RUBY

I was reading some random conversation threads on Hacker News the other day when I came across an article announcing that Polars had just been ported to the Ruby programming language. Now, unless you have been living under a rock for the past year or so, you probably know that Polars is a data manipulation library that was written entirely in Rust.

EXPLORING STRING DISTANCES WITH TYPESCRIPT AND TALISMAN

Most of the articles I write for this website are inspired by problems that I come across at work. Seeing a large influx of spammy, and most likely automatically generated, user content is pretty common. And to be fair, most social media platforms have become quite good at catching that type of content before it can even start causing harm to their user base.

OPENAI'S WHISPER IS SO GOOD IT CAN TRANSCRIBE ANY SONG'S LYRICS

Though a quick Google search confirms that the first attempts at having computers identify and extract spoken language date back to the early 1950s, voice recognition technology has only recently been made accessible to the general public. Like most of my friends, I own a small Alexa device at home. And like them, I never use it. Yet I was very excited when I found out about OpenAI’s Whisper on Hacker News a little while ago.

TOPIC MODELLING VISUALISATION WITH ANYCHART.JS

Foreword: this post is dedicated to my workmate and friend Martin, who recently showed me some pretty cool stuff he has been doing with Sankey charts. Back in the early days of the 2020 pandemic, I got a bit bored at home and started thinking about creating a website. I remember that my first idea was to write a couple of articles dedicated to topic modelling, and see where that would take me.

ARQUERO: A GREAT DATAFRAME TOOLKIT FOR JAVASCRIPT

Most open positions for data-related jobs on any popular employment website will likely list Python or R as the languages that applicants must be skilled in. But hey, nobody leaves JavaScript in the corner! Data manipulation packages for the Node ecosystem have grown a lot over the past three or four years, to the point where they have become a credible alternative to more popular Python- or R-based libraries such as Pandas or Dplyr.

EXPLAINING SENTIMENT SCORES WITH TRANSFORMERS AND SHAP

Wouldn’t sentiment analysis be easier if we could show which terms or chunks of terms within a given corpus contribute to the overall sentiment score of the corpus or some of its parts? I recently came across a pretty neat library named SHAP which, amongst many other things, provides some useful tools for explaining sentiment scores.

EXPLORING POS TAGS CO-OCCURRENCE WITH WINKNLP AND HIGHCHARTS.JS

I’ve been playing around a lot with NeuralCoref lately, a pipeline extension for spaCy developed by Hugging Face. If you’re interested in coreference resolution, this article from Hugging Face’s Thomas Wolf seems like a great place to start. Are we going to discuss neural coreferencing today? Absolutely not. If you head over to NeuralCoref’s GitHub page, your eyes will probably be drawn immediately towards this very fancy visualisation that maps the semantic relationships between the terms within a short sentence:

CREATE A SIMPLE IN-BROWSER SQL PLAYGROUND WITH PYSCRIPT

Finding an online SQL playground that’s both free and user-friendly can be a little bit challenging. Most platforms, such as StrataScratch for instance, restrict what free tier users can do, while others hide the querying interface under layers of ads and pop-ups. That being said, it’s still possible to find a couple of high-quality solutions, and I personally really like Coderpad.

GOING BEYOND THE SENTIMENT SCORE, PART 1: SENTIMENT.JS

A good few years back, I used to work for a bank where part of my daily job was to monitor and evaluate the “happiness score” of our customers across several social media platforms, using a tool called Brandwatch. Amongst many other things, this platform offered its customers the ability to define a set of rules and add a corresponding sentiment tag to each and every mention of their brand or of any of their competitors.

TIME SERIES FORECASTING WITH META'S PROPHET

Please note that though I am currently employed by Meta, this article expresses my own views and wasn’t endorsed by my employer. The past few years have seen a rise in popularity of new libraries that focus on ease of use and automation. If, like me, you have always been fascinated by time series forecasting, you must be familiar with packages like Darts or PyCaret.