Maria Khalusova

Part of speech tagging with Hidden Markov Model in Kotlin

In NLP, part-of-speech tagging is a process in which you mark words in a text (aka corpus) as corresponding parts of speech (e.

Baseline Sentiment Analysis with Naive Bayes in Kotlin

The other weekend I implemented a simple sentiment classifier for tweets in Kotlin with Naive Bayes.

Machine Learning Model Evaluation Metrics part 3: Regression

In the final part of my series on ML model evaluation metrics we’ll talk about metrics that can be applied to regression problems.

Machine Learning Model Evaluation Metrics part 2: Multi-class classification

Hi! Welcome back to the second part of my series on different machine learning model evaluation metrics.

Machine Learning Model Evaluation Metrics part 1: Classification

If you’re at the beginning of your machine learning journey, you may be taking online courses, reading books on the topic, dabbling with competitions and maybe even starting your own pet projects.

Cleaning data with pandas

Whether you want to do an exploratory data analysis or train a machine learning model, the first thing you will inevitably have to do is clean the data you’ve got.

Pandas in anger

Pandas is an essential library in a Data Scientist’s toolbox. If you’re just starting to learn, you’ll find a lot of great intro tutorials that’ll help you take your first steps with it.

Getting data, part 3: APIs

In this last part of the “getting data” sub-series, I want to mention, without going into too much detail, one more way of obtaining data that you may need for your Data Science project.

Getting data, part 2: web scraping and the walking dead

In my previous blog post I talked about getting data from a CSV file (even if it’s messed up), or a database.

Getting data, part 1: reading a messy CSV, querying a database

Quite obviously, data science is not really possible without data. Before you can start munging your data, visualizing it, training models on it, you need to get your hands on it first.