What I do
I'm a data engineer with a solid background in back-end development and DevOps, and over 10 years of experience building data-intensive applications for clients. Below is a quick overview of my main technical skill sets and the technologies I use. Want to find out more about my experience? Check out my online resume.
Python, Golang, Java & C++
Python has been my main coding language for more than 8 years, across projects like data wrangling, data science and web scraping.
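As a toy illustration of the kind of data wrangling work mentioned above, here is a sketch using only the standard library (the dataset and field names are made up):

```python
import csv
import io
from collections import defaultdict

# A tiny CSV standing in for real input data.
raw = """city,temp
Paris,12.5
Paris,14.0
Lyon,9.5
"""

# Parse the CSV and aggregate the average temperature per city.
totals, counts = defaultdict(float), defaultdict(int)
for row in csv.DictReader(io.StringIO(raw)):
    totals[row["city"]] += float(row["temp"])
    counts[row["city"]] += 1

averages = {city: totals[city] / counts[city] for city in totals}
print(averages)  # {'Paris': 13.25, 'Lyon': 9.5}
```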
I have been using Golang for over a year to build microservices and automation tooling. I've also used Java and C++ for mission-critical projects
like data ingestion pipelines and blockchain smart contracts.
Apache Airflow, Oozie & Talend
Apache Airflow is my favourite Python framework for building complex, highly scalable data ingestion pipelines.
I also use Oozie when working on YARN and Hadoop clusters, and occasionally Talend for traditional ETL and BI needs.
Apache Spark, Hadoop & Kafka
When it comes to processing large volumes of data, Spark is well suited to the task. I have used this toolset on various mission-critical, petabyte-scale data projects.
PostgreSQL, MySQL & Hive
I have used relational databases on many projects, whether as backends for web apps and APIs or as multidimensional data warehouses.
I have also used Apache Hive to build data warehouses on top of HDFS clusters.
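To sketch the relational pattern described above, here is a small example using Python's built-in sqlite3 as a lightweight stand-in for PostgreSQL or MySQL (the schema and data are hypothetical):

```python
import sqlite3

# In-memory database; a real project would connect to a database server instead.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 100.0), ("north", 50.0), ("south", 75.0)],
)

# A typical warehouse-style aggregation query.
rows = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
print(rows)  # [('north', 150.0), ('south', 75.0)]
conn.close()
```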
MongoDB, Cassandra & Neo4J
I have used multiple NoSQL databases, as well as graph databases, on different projects, whether as application backends, in data ingestion pipelines, or for analytical purposes.
Jenkins, Docker & Kubernetes
I use Jenkins as my primary CI/CD tool on basically every project I work on, including this website.
The same goes for Docker and Kubernetes: for more than 3 years, this stack has been part of my daily work routine.
Grafana, QlikSense & Tableau
I have worked with different dashboarding tools depending on the use case: Prometheus and Grafana mainly for system and application monitoring, and Tableau and QlikSense for reporting purposes.
React, Javascript, HTML & CSS
Frontend development is not my strongest suit! However, I have worked with web technologies on several projects.
Latest Blog Posts
Making sense of Twitter data with NLP
Using Natural Language Processing (NLP), and Scikit-Learn, we will make a basic sentiment classification model from the data we have collected ...
Pre-processing Twitter Data using Python
In this blog post, I’ll show you how to clean the data we have extracted from Twitter in the previous article, and prepare it to do some analysis ...
Collecting data from Twitter REST Search API using Python
Using Python and Twython library, I’ll show you in this blog post how to connect to Twitter Search API, submit search queries and extract the results ...