Sadik Bakiu · Aug 19, 2023 · 8 min read
MultiGPU Kubernetes Cluster for Scalable and Cost-Effective Machine Learning with Ray and Kubeflow
Introduction: Large Language Models (LLMs) are very much in demand right now, and they need a lot of compute power to train. Llama 1 used...

Bujar Bakiu · Oct 14, 2022 · 5 min read
Dockerizing dbt Transformations for Managed Airflow: Docker, dbt, and GCP Cloud Composer
Airflow is one of the most popular pipeline orchestration tools out there. It has been around for more than 8 years, and it is used...

Kejdi Tako · Sep 14, 2022 · 3 min read
Distributed Machine Learning Model Training with Spark (PySpark)
GitHub repo: https://github.com/data-max-hq/pyspark-3-ways
What is Spark? Apache Spark was designed to function as a simple API for...

Endri Veizaj · Aug 29, 2022 · 6 min read
Serving Dog Breed Classification model with Seldon-Core, TensorFlow Serving and Streamlit
GitHub repo: https://github.com/data-max-hq/dog-breed-classification-ml
In a modern Machine Learning workflow, after figuring out the...

Igli · Aug 24, 2022 · 4 min read
Deploy Airflow and Metabase in Kubernetes using Infrastructure-as-Code
A step-by-step guide to deploying Airflow and Metabase in GCP with Terraform and Helm providers. With the extensive usage of cloud...

Megi Menalla · Jul 12, 2022 · 7 min read
A hands-on project with dbt, Streamlit, and PostgreSQL
Data Engineering with dbt and Streamlit. How to build a project with dbt, Streamlit, and PostgreSQL.

Sadik Bakiu · Apr 23, 2022 · 3 min read
Modern Data Team Hats
This blog was written together with Martin Rusnak from Rusnak Consulting and Bujar Bakiu. Not that long ago (maybe somewhere this is still the...