Sadik Bakiu | Aug 19, 2023 | 8 min read
MultiGPU Kubernetes Cluster for Scalable and Cost-Effective Machine Learning with Ray and Kubeflow
Introduction: Large Language Models (LLMs) are very much in demand right now, and they need a lot of compute power to train. Llama 1 used...

Bujar Bakiu | Oct 14, 2022 | 5 min read
Dockerizing dbt Transformations for Managed Airflow: Docker, dbt, and GCP Cloud Composer
Airflow is one of the most popular pipeline orchestration tools out there. It has been around for more than 8 years, and it is used...

Aldo Sula | Sep 19, 2022 | 6 min read
Orchestrating Pipelines with Dagster
A complete guide on how to integrate dbt with Dagster and an automated CI/CD pipeline to deploy on an AWS Kubernetes cluster. This blog...

Kejdi Tako | Sep 14, 2022 | 3 min read
Distributed Machine Learning Model Training with Spark (PySpark)
GitHub repo: https://github.com/data-max-hq/pyspark-3-ways
What is Spark? Apache Spark was designed to function as a simple API for...

Endri Veizaj | Aug 29, 2022 | 6 min read
Serving Dog Breed Classification model with Seldon-Core, TensorFlow Serving and Streamlit
GitHub repo: https://github.com/data-max-hq/dog-breed-classification-ml
In a modern Machine Learning workflow, after figuring out the...