MLflow Deep Dive: Tracking, Registry, Serving (2026)

MLflow is an open-source platform for managing the machine learning lifecycle in one place: experiment tracking, model management, and deployment. Development is led by Databricks, and on Databricks it is available as a fully managed service integrated with the rest of the platform. It is a core topic on the ML Associate and Professional exams, and the basic concepts also appear on the Data Engineer exams.

The 4 MLflow Components

Component	Role	Exam Importance
MLflow Tracking	Logs experiment parameters, metrics, and artifacts	Critical
MLflow Models	Packages models in a standard format	Important
Model Registry	Model versioning and stage transitions	Critical
MLflow Projects	Packages ML code for reproducibility	Rarely tested

MLflow Tracking

Tracking records who ran what — which parameters, on which data, when, and with what results. A single training run is called a Run, and multiple Runs are grouped into an Experiment.

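import mlflow
import mlflow.xgboost
from sklearn.metrics import accuracy_score, f1_score
from xgboost import XGBClassifier

# Select the Experiment (created automatically if it does not exist)
mlflow.set_experiment("/Users/taro/churn_prediction")

# Start a Run
with mlflow.start_run(run_name="xgboost_v1"):
    # Log parameters
    mlflow.log_param("max_depth", 5)
    mlflow.log_param("learning_rate", 0.01)
    mlflow.log_param("n_estimators", 100)

    # Train the model (X_train, y_train, X_test, y_test are assumed to exist already)
    model = XGBClassifier(max_depth=5, learning_rate=0.01, n_estimators=100)
    model.fit(X_train, y_train)

    # Log metrics
    y_pred = model.predict(X_test)
    mlflow.log_metric("accuracy", accuracy_score(y_test, y_pred))
    mlflow.log_metric("f1_score", f1_score(y_test, y_pred))

    # Save the model with the flavor that matches the framework
    mlflow.xgboost.log_model(model, "model")

    # Save artifacts (plots, config files, etc.); the file must already exist locally
    mlflow.log_artifact("confusion_matrix.png")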

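A single call to mlflow.autolog() automatically logs parameters, metrics, and models for many frameworks (scikit-learn, XGBoost, LightGBM, PyTorch through PyTorch Lightning, TensorFlow, and more). This is a standard MLflow feature rather than a Databricks-only one, so the same call works on and off Databricks.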

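import mlflow
from sklearn.ensemble import RandomForestClassifier

# Autologging (recommended)
mlflow.autolog()

# With autologging on, fit() records parameters, metrics, and the model automatically
model = RandomForestClassifier(n_estimators=100, max_depth=5)
model.fit(X_train, y_train)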

Model Registry

The Model Registry handles versioning and lifecycle management for trained models. It centralizes which model version sits in which stage and lets you build approval workflows for production deployment.

# Register a model from a Run in the Model Registry
mlflow.register_model(
    model_uri="runs:/abc123def456/model",
    name="churn_prediction_model"
)

# Moving a version between lifecycle states
# Legacy Workspace Model Registry: stage transitions (Staging → Production)
# Unity Catalog Model Registry: set aliases such as "champion" / "challenger"
# (in Unity Catalog, registered model names take the form catalog.schema.model)
from mlflow import MlflowClient
client = MlflowClient()
client.set_registered_model_alias(
    name="churn_prediction_model",
    alias="champion",
    version=3
)

Item	Legacy Workspace Model Registry	Unity Catalog Model Registry
Scope	Per workspace	Per Unity Catalog metastore (shared across the workspaces attached to it)
Stage Management	None / Staging / Production / Archived	Aliases (champion / challenger, etc.)
Governance	Workspace-level permissions	Unity Catalog privileges on the three-level namespace (catalog.schema.model)
Lineage	Limited	Automatic table → model → serving lineage
Recommendation	Legacy (being phased out)	Recommended for new projects

Model Deployment

Models registered in the Model Registry can be deployed as real-time inference endpoints via Databricks Model Serving. For batch inference, use mlflow.pyfunc.spark_udf() to load the model as a Spark UDF.

# Batch inference: apply the model as a Spark UDF
import mlflow
model_uri = "models:/churn_prediction_model@champion"
predict_udf = mlflow.pyfunc.spark_udf(spark, model_uri)

predictions = (spark.table("silver.customers")
  .withColumn("churn_probability", predict_udf("age", "tenure", "monthly_charges"))
)
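
For the real-time path, a Model Serving endpoint is queried over HTTPS. The sketch below only builds the request, so it can be inspected without a workspace. The helper name, host, endpoint name, and token are placeholders, and dataframe_split is used here as one of the payload formats the endpoint accepts.

```python
import json
import urllib.request


def build_invocation_request(host, endpoint_name, token, columns, rows):
    """Build (but do not send) a scoring request for a serving endpoint."""
    payload = {"dataframe_split": {"columns": columns, "data": rows}}
    return urllib.request.Request(
        url=f"{host.rstrip('/')}/serving-endpoints/{endpoint_name}/invocations",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_invocation_request(
    host="https://example.cloud.databricks.com",  # placeholder workspace URL
    endpoint_name="churn-endpoint",               # hypothetical endpoint name
    token="<personal-access-token>",              # placeholder credential
    columns=["age", "tenure", "monthly_charges"],
    rows=[[42, 18, 59.9]],
)
print(request.full_url)
# To send it: urllib.request.urlopen(request), then parse the JSON response body.
```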

Practical Patterns for Managing Experiments

Name Experiments with project name + model purpose (e.g. /Projects/churn/xgboost)
Include the method and version in the Run name (e.g. xgboost_v3_tuned)
Log the data version (Delta Table version number) as a parameter as well
Use the Compare feature to view metrics side-by-side across Runs
Register the best Run in the Model Registry and mark it as the champion alias

What the Exam Asks About

The hierarchy between Experiment, Run, and Model
When to use mlflow.log_param() / log_metric() / log_model() / log_artifact()
How mlflow.autolog() behaves (supported frameworks, what gets logged automatically)
Model Registry stage transitions (legacy: Staging→Production, new: aliases)
Differences between Unity Catalog Model Registry and the legacy Workspace Model Registry
Batch inference patterns using spark_udf()

Check Your Understanding

ML Associate / Professional

Question 1

An ML engineer wants to train models across multiple hyperparameter configurations and deploy the best one to production. Which sequence implements this workflow most efficiently with MLflow?

A. Enable mlflow.autolog() and train across configurations → compare Runs in the MLflow UI → register the best Run's model in the Model Registry → set the champion alias and deploy via Model Serving
B. Save each model manually as a Pickle file → upload to S3 → deploy via a Lambda function
C. Log every hyperparameter combination in a CSV → expose the best model directly from a notebook via REST API
D. Manually INSERT each Run's metrics into a Delta Table → find the best Run with SQL → copy the model file by hand

Correct answer: A

The standard workflow is to log parameters, metrics, and models with MLflow autologging, compare Runs in the UI, manage versions in the Model Registry, and deploy via an alias. Manual Pickle saves and CSV logging sacrifice reproducibility and traceability.

Frequently Asked Questions

Can MLflow be used outside of Databricks?

Yes. MLflow is an open-source project (hosted under the Linux Foundation) and can be used in any environment, including local machines, Amazon SageMaker, Google Cloud Vertex AI, and Azure Machine Learning. Autologging is part of open-source MLflow as well. What Databricks adds is the managed layer: a hosted MLflow Tracking Server, Unity Catalog integration for model governance, and autologging that is switched on by default in many notebook environments.

I don't understand the relationship between Experiment, Run, and Model.

An Experiment is a container (like a folder) for Runs, a Run is one trial inside it (a single model training), and a Model is the artifact you ultimately deploy. For example, inside a "churn prediction" Experiment you might have a "Random Forest, max_depth=5" Run and an "XGBoost, learning_rate=0.01" Run, then register the best Run's model in the Model Registry and deploy it.

How does MLflow appear on the exams?

MLflow is one of the most heavily tested topics on the ML Associate and Professional exams. Key areas include Tracking (logging parameters, metrics, and artifacts), Model Registry (model versioning, stage transitions, and aliases), logging with and without autologging, and the Unity Catalog Model Registry. Basic MLflow concepts also appear on the Data Engineer exams.

Author

NicheeLab Editorial Team

NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.
