Posts

Comparing LLM Observability Tools: LangSmith, LangFuse, Lunary, and Helicone

Observability for LLM applications is critical. Whether you're troubleshooting unexpected model outputs, tracking token usage and costs, or fine-tuning your prompt strategies, having the right observability tool can make all the difference. In this post, we compare four popular platforms: LangSmith, LangFuse, Lunary, and Helicone, to help you determine which fits your needs best.

Why LLM Observability Matters

LLM observability goes beyond classic infrastructure monitoring. With LLM apps, you need:

- Detailed tracing of prompt-to-response flows
- Evaluation metrics to monitor model performance and output quality
- Cost tracking for usage-heavy deployments
- Robust integration with your existing workflows (e.g., LangChain or other frameworks)

As models become more complex and integrated into mission-critical applications, understanding these dimensions is essential for debugging, compliance, and performance optimization.

Tool Ov...
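To make the prompt-to-response tracing point concrete, here is a minimal sketch using LangSmith's @traceable decorator (LangFuse, Lunary, and Helicone offer similar decorator- or proxy-based integrations). It assumes the langsmith and openai packages are installed and the relevant API keys are set; the function name, model, and prompt are illustrative placeholders, not taken from the post.

# Minimal tracing sketch with LangSmith's @traceable decorator.
# Assumes LANGCHAIN_TRACING_V2=true, LANGCHAIN_API_KEY and OPENAI_API_KEY are set;
# the model name and prompt are illustrative placeholders.
from langsmith import traceable
from openai import OpenAI

client = OpenAI()

@traceable(name="summarize")  # every call is recorded as a trace with inputs and outputs
def summarize(text: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": f"Summarize in one sentence: {text}"}],
    )
    return response.choices[0].message.content

print(summarize("LLM observability goes beyond classic infrastructure monitoring."))

The resulting trace captures the call's inputs, outputs, and latency in the LangSmith UI, which is exactly the kind of prompt-to-response visibility the comparison focuses on.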

Deploying Streamlit App on Azure App Service (without Docker) using Azure DevOps

First, create a web app in Azure App Service. I recommend using Python >= 3.10 to prevent any issues (e.g., with Python 3.9 the app didn't load properly). Because we don't use Docker, we select 'Code' for Publish and 'Linux' for Operating System. Once we have the app, we're ready to deploy it using an Azure DevOps pipeline.

1. Archive the code into a zip

stages:
- stage: Build
  displayName: Build
  dependsOn: []
  jobs:
  - job: Build
    displayName: Build the function app
    steps:
    - task: UsePythonVersion@0
      displayName: "Setting python version to 3.10 as required by functions"
      inputs:
        versionSpec: '3.10'
        architecture: 'x64'
    - task: ArchiveFiles@2
      displayName: "Archive files"
      inputs:
        rootFolderOrFile: "$(System.DefaultWorkingDirectory)"
        includeRootFolder: false
        archiveFile: "$(System.DefaultWorkingDirector...

Data Quality Monitoring Tools

Data quality monitoring tools are essential for ensuring the accuracy and reliability of your data. With so many options on the market, it can be challenging to know which one to choose. In this post, we will compare five popular but different data quality monitoring tools: Soda, Great Expectations (GE), Re_data, Monte Carlo, and LightUp.

Open Source

Soda, GE, and Re_data are all open-source tools, while Monte Carlo and LightUp are not. All the tools are based on Python, except for Monte Carlo, which doesn't specify its base.

Data Sources - In Memory

Soda uses Spark for in-memory data sources, while GE uses pandas and Spark. Re_data doesn't specify, and Monte Carlo and LightUp don't support in-memory data sources.

Data Sources - Database/Lake

Soda supports athena, redshift, bigquery, postgresql, and snowflake, while GE supports athena, bigquery, mssql, mysql, postgresql, redshift, snowflake, sqlite, and trino. Re_data supports dbt, and Monte Carlo supports snowflake, redshift, b...
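As a taste of what these tools look like in code, here is a minimal in-memory check using Great Expectations' classic pandas-backed API (the GE API has changed considerably across versions, so treat the exact calls as an assumption; the DataFrame and column names are made up).

# Minimal GE sketch using the classic pandas-backed API (version-dependent).
import great_expectations as ge
import pandas as pd

# Made-up sample data with one missing value.
df = ge.from_pandas(pd.DataFrame({"order_id": [1, 2, 3], "amount": [10.0, None, 25.5]}))

# Each expectation returns a validation result with a boolean `success` flag.
result = df.expect_column_values_to_not_be_null("amount")
print(result.success)  # False, because one amount is missing

Soda expresses similar checks declaratively in SodaCL YAML and Re_data piggybacks on dbt tests, but the underlying idea of asserting expectations against a dataset is the same.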

Azure ML vs Databricks for deploying machine learning models

Azure Machine Learning (Azure ML) and Databricks Machine Learning (Databricks ML) are two popular cloud-based platforms for data scientists. Both offer a range of tools and services for building and deploying machine learning models at scale. In this blog post, we'll compare Azure ML and Databricks ML, examining their features and capabilities, and highlighting their differences.

Experimentation

Azure ML

The Python API allows you to easily create experiments that you can then track from the UI. You can do interactive runs from a notebook. Logging metrics in these experiments still relies on the MLflow client.

Databricks ML

Creating experiments is also easy with the MLflow API and the Databricks UI. Tracking metrics is really nice with the MLflow API (so nice that Azure ML also uses this client for its model tracking).

Winner

The two are pretty much on par here, although the fact that Azure ML uses MLflow (a Databricks product) maybe giv...
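Since both platforms lean on MLflow for experiment tracking, a minimal logging sketch looks the same on either one (the experiment name, parameters, and metric below are placeholders; inside an Azure ML or Databricks workspace the tracking URI is typically preconfigured, otherwise you would point MLflow at your own tracking server).

# Minimal MLflow tracking sketch; names and values are placeholders.
import mlflow

mlflow.set_experiment("demo-experiment")  # created on the fly if it doesn't exist

with mlflow.start_run(run_name="baseline"):
    mlflow.log_param("learning_rate", 0.01)
    mlflow.log_param("n_estimators", 100)
    # ... train a model here ...
    mlflow.log_metric("rmse", 0.42)

Runs logged this way show up in both the Azure ML studio UI and the Databricks experiments UI, which is part of why the experimentation experience feels so similar across the two.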