Julien's data blog

EXPLORE YOUR DATAFRAMES WITH PYGWALKER

I know quite a few people who will do anything to avoid having to work with a Pandas dataframe. Funnily enough, they’re usually much better programmers than I will ever be. But they’re stuck in some sort of Catch-22-like situation where they can’t memorise Pandas’ most basic functionalities because they never use the library at all, which in turn makes them even more reluctant to try and manipulate dataframes as they can’t remember which methods and attributes to use.

Tue, Jul 9, 2024

BUN: A GREAT JAVASCRIPT RUNTIME FOR DATA PRACTITIONERS

If, like me, you’re doing your best to keep up with the tech industry in general, you’ve probably noticed that web development has become incredibly complex and difficult to follow over the past 6 to 7 years. Not many people seem to write vanilla JavaScript anymore, and navigating through the various existing runtimes (Node, Deno) or frontend frameworks (React, Vue, etc.

Fri, May 24, 2024

EASY IN-BROWSER EXPLORATION OF SMALL CSV FILES WITH WEBDATAROCKS

I’ve been meaning to share some thoughts on WebDataRocks for a while now, as it’s helped me find a solution to a fairly minor technical challenge I stumbled upon a few months ago. To add a bit of context, exploring the vast online world from a corporate device can be a bit of a hit-and-miss experience. Take the official Irish Data Portal for instance.

Sun, Apr 21, 2024

BOOK REVIEW: JAVASCRIPT FOR DATA SCIENCE (2020)

Disclaimer: all screenshots were produced from the digital version of the book, but I do own a physical copy of it. Though most entries on this website usually consist of hands-on guides and random programming tutorials, I will from now on try to share some books or articles that I have read and found interesting. I remember first hearing about JavaScript for Data Science on Twitter, as I was looking for a Pandas-like package for the Node.

Sun, Mar 17, 2024

DICTIONARY APIS FOR THE ENTHUSIASTIC LINGUIST: AN OVERVIEW

For anybody who’s ever worked with textual data, the past 4 or 5 years have been an absolute blast. Since the publication of Attention Is All You Need in 2017, the field of natural language processing has seen new frameworks, libraries, and concepts coming up on a regular basis. Take sentiment analysis for instance. For years, available solutions were limited to rule-based models like VADER or AFINN.

Thu, Feb 15, 2024

NEW YEAR, ALMOST NEW ME

New Year’s resolutions and I haven’t always been the best of friends. For a long time, the concept of committing to doing something for a whole year, while being totally clueless about what I’d be doing even 2 months later, felt like peak stupidity. I mean, the whole thing just seemed absurd, as I knew perfectly well that the aspirations and centres of interest of my future self would simply no longer match those that I had at the time of making that decision.

Sun, Jan 14, 2024

ADVENT OF CODE 2023: DAYS 1 AND 2

Yes, it’s that time of the year again! While we’re all enjoying a few weeks of festive activities and a bit of well-deserved quality time with our loved ones, some of us are deliberately choosing to spend this time solving some random programming challenges. Now if you ask me, what I really like about Advent of Code is the creative and fun ways that some programmers approach each new puzzle.

Tue, Dec 26, 2023

TEXT SUMMARISATION IN TYPESCRIPT WITH TRANSFORMERS.JS

If you’re a long-time follower of this website, you probably know by now how much I’ve been advocating for the use of JavaScript (and TypeScript) as a second language for any data practitioner that might want to broaden their horizons and learn some new and useful skills. I was therefore very excited when Hugging Face recently announced that they would soon be porting their state-of-the-art transformers libraries to the JavaScript ecosystem.

Sat, Nov 25, 2023

SIMPLIFY WEBSITE SCRAPING WITH TRAFILATURA

Below is an example of what we’ll be doing in this article: In early 2022, I wrote a very basic Python program to scrape some articles from an Irish website named The Journal. Long story short, all I needed at that time was to capture the content of Covid-related articles as well as their attached user comments, and attempt to train a model on that data. A bit less than six months later, that simple .

Sun, Oct 29, 2023

STARBOARD.GG, AND OTHER NOTEBOOK ENVIRONMENTS FOR NON-PYTHON DATA SCIENCE: PART I

Have you ever wondered what makes a language a good fit for a particular space or not? Its design choices, overall syntax, and to a lesser extent speed and performance are arguably some of the first elements that you’ll likely hear about when asking this question around. I personally think that tooling and the landscape of existing dependencies also play a huge role in the adoption of a given language by a specific community.

Mon, Oct 2, 2023