Hyperparameter tuning is the process of optimizing settings that are not learned during training, such as learning rate, tree depth, and regularization strength. On Databricks, the main tools are Hyperopt (with distributed search via SparkTrials) and Optuna, and every trial can be recorded automatically in MLflow Tracking. The ML Associate exam frequently tests the basic Hyperopt API, while the ML Professional exam often asks when to use SparkTrials rather than Trials.
The algorithms used to find the best combination from a hyperparameter search space fall into three broad categories.
| Strategy | How it works | Pros | Cons | Representative tools |
|---|---|---|---|---|
| Grid Search | Exhaustively tries every specified combination | Highly reproducible; effective when there are few parameters | Search space grows exponentially (curse of dimensionality) | scikit-learn GridSearchCV |
| Random Search | Samples randomly from the search space | Tends to find good solutions with fewer trials than Grid Search | Ignores the results of earlier trials, so it keeps sampling unpromising regions | scikit-learn RandomizedSearchCV |
| Bayesian Optimization | Builds a probabilistic model of the objective from past trials (TPE is one such method) and uses it to choose the next point to try | Usually reaches good configurations in fewer trials than Random Search; TPE copes with categorical and conditional parameters | Each suggestion depends on earlier results, so heavy parallelization needs careful design | Hyperopt (TPE), Optuna (TPE) |
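To make the first two rows of the table concrete, the sketch below counts the points a grid search would evaluate and draws a fixed random budget from the same space. The parameter names and candidate values are hypothetical, chosen only for illustration, and no model is trained.

```python
import itertools
import random

# Hypothetical search space: three candidate values for each of four hyperparameters.
grid = {
    "max_depth": [3, 6, 9],
    "n_estimators": [100, 200, 400],
    "min_samples_split": [2, 5, 10],
    "max_features": [0.3, 0.6, 1.0],
}

# Grid search evaluates the full Cartesian product: 3 ** 4 combinations.
grid_points = [dict(zip(grid, combo)) for combo in itertools.product(*grid.values())]

# Random search draws a fixed budget of points, however many parameters there are.
rng = random.Random(0)
budget = 20
random_points = [
    {name: rng.choice(values) for name, values in grid.items()}
    for _ in range(budget)
]

print(len(grid_points), len(random_points))  # prints: 81 20
```

Adding a fifth parameter with three values would triple the grid to 243 points, while the random budget would stay at 20.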
For both practical work and the exam, the most important strategy on Databricks is Bayesian optimization with TPE (Tree-structured Parzen Estimator). Hyperopt and Optuna both use TPE as their default algorithm. As a rough guide, a budget of 50 to 200 trials is often enough to find a strong configuration, though the number needed grows with the size of the search space.
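The core idea of TPE can be sketched in a few lines: split past trials into a "good" group and a "bad" group, model the density of each, and pick the candidate where the good density is highest relative to the bad one. The code below is a heavily simplified one-dimensional illustration, not Hyperopt's or Optuna's implementation. The function names, the fixed kernel bandwidth, and the toy loss are all assumptions.

```python
import math
import random

def kde(x, centers, bandwidth=1.0):
    """Average of Gaussian kernels placed on each observed point."""
    norm = bandwidth * math.sqrt(2 * math.pi)
    total = sum(math.exp(-0.5 * ((x - c) / bandwidth) ** 2) for c in centers)
    return total / (len(centers) * norm)

def suggest_next(history, gamma=0.25, n_candidates=24, bandwidth=1.0, seed=0):
    """Pick the next x from past (x, loss) pairs with one simplified TPE step."""
    if len(history) < 2:
        raise ValueError("need at least two past trials")
    rng = random.Random(seed)
    ranked = sorted(history, key=lambda pair: pair[1])
    n_good = max(1, math.ceil(gamma * len(ranked)))
    good = [x for x, _ in ranked[:n_good]]  # trials with the lowest losses
    bad = [x for x, _ in ranked[n_good:]]   # all remaining trials
    # Sample candidates from the density of the good points, l(x) ...
    candidates = [rng.gauss(rng.choice(good), bandwidth) for _ in range(n_candidates)]
    # ... and keep the candidate where l(x) / g(x) is largest.
    return max(
        candidates,
        key=lambda x: kde(x, good, bandwidth) / kde(x, bad, bandwidth),
    )

# Hypothetical history: 11 evenly spaced trials of a loss whose minimum is at x = 7.
history = [(float(x), (x - 7.0) ** 2) for x in range(11)]
next_x = suggest_next(history)
print(next_x)  # a point near the best trials seen so far
```

Because each suggestion is computed from completed trials, the quality of the next point depends on how many results are available when it is chosen.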
Hyperopt is the Bayesian optimization library that ships with Databricks Runtime for Machine Learning. To run a tuning job, pass an objective function, a search space, a search algorithm, and a maximum number of trials to fmin(). The search space is built from the hp functions below.
| Function | Use case | Example |
|---|---|---|
| hp.choice(label, options) | Categorical values (discrete choices) | hp.choice("algo", ["rf", "xgb", "lgb"]) |
| hp.uniform(label, low, high) | Uniform distribution (continuous values) | hp.uniform("dropout", 0.1, 0.5) |
| hp.loguniform(label, low, high) | Log-uniform distribution (parameters that span orders of magnitude, e.g. learning rate); low and high are bounds on the log of the value | hp.loguniform("lr", np.log(1e-5), np.log(1e-1)) |
| hp.quniform(label, low, high, q) | Quantized uniform distribution (integer-valued parameters); returns a float, so cast with int() | hp.quniform("max_depth", 3, 15, 1) |
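The sampling rules behind hp.loguniform and hp.quniform are easy to misread, so here is a small standard-library sketch of them. It follows the formulas in the Hyperopt documentation, exp(uniform(low, high)) and round(uniform(low, high) / q) * q, but the helper functions are hypothetical re-implementations for illustration, not Hyperopt code.

```python
import math
import random

rng = random.Random(0)

def sample_loguniform(low, high):
    # low and high bound the logarithm of the value, not the value itself
    return math.exp(rng.uniform(low, high))

def sample_quniform(low, high, q):
    # snaps a uniform draw to the nearest multiple of q; the result stays a float
    return float(round(rng.uniform(low, high) / q) * q)

learning_rates = [sample_loguniform(math.log(1e-5), math.log(1e-1)) for _ in range(1000)]
depths = [sample_quniform(3, 15, 1) for _ in range(1000)]

# About half of the learning rates fall below 1e-3, the midpoint on a log scale.
share_below_1e3 = sum(lr < 1e-3 for lr in learning_rates) / len(learning_rates)

# Quantized values are whole numbers stored as floats, so cast before use.
max_depth = int(depths[0])
```

A plain uniform draw over the same learning-rate range would put almost all samples above 1e-3, which is why log-uniform is the usual choice for parameters that span several orders of magnitude.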
```python
from hyperopt import fmin, tpe, hp, STATUS_OK, SparkTrials
import mlflow
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# X_train and y_train are assumed to be defined already
# (binary labels, since scoring="f1" is used below).

# Define the search space
search_space = {
    "n_estimators": hp.quniform("n_estimators", 50, 500, 50),
    "max_depth": hp.quniform("max_depth", 3, 15, 1),
    "min_samples_split": hp.quniform("min_samples_split", 2, 20, 1),
    # Pruning strength spans orders of magnitude, so sample it on a log scale
    "ccp_alpha": hp.loguniform("ccp_alpha", np.log(1e-5), np.log(1e-2)),
}

# Objective function (returns the value to minimize)
def objective(params):
    # hp.quniform returns floats, so cast the integer parameters
    model_params = {
        "n_estimators": int(params["n_estimators"]),
        "max_depth": int(params["max_depth"]),
        "min_samples_split": int(params["min_samples_split"]),
        "ccp_alpha": params["ccp_alpha"],
    }
    clf = RandomForestClassifier(**model_params, random_state=42)
    score = cross_val_score(clf, X_train, y_train, cv=3, scoring="f1").mean()
    # Log metrics to MLflow
    mlflow.log_metrics({"f1_cv": score})
    # Return STATUS_OK and loss (loss is minimized)
    return {"loss": -score, "status": STATUS_OK}

# Run distributed with SparkTrials
spark_trials = SparkTrials(parallelism=8)
with mlflow.start_run(run_name="hyperopt_rf_tuning"):
    best_params = fmin(
        fn=objective,
        space=search_space,
        algo=tpe.suggest,     # TPE (Bayesian Optimization)
        max_evals=100,        # up to 100 trials
        trials=spark_trials,  # run in parallel on Spark executors
    )
```

The objective function must return a dict in the form {"loss": value, "status": STATUS_OK}. Because loss is the value fmin minimizes, return the negated value (-score) when you want to maximize a metric such as F1 or accuracy.
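The sign convention can be checked without running a tuning job. The sketch below wraps a few hypothetical F1 scores in the result shape described above and shows that the smallest loss belongs to the highest score. STATUS_OK is redefined locally with the same string value Hyperopt uses, so the example has no dependencies. The helper name and the scores are made up.

```python
STATUS_OK = "ok"  # same string as hyperopt.STATUS_OK, redefined so the sketch runs without Hyperopt

def to_result(score):
    """Wrap a score to maximize in the dict shape fmin expects."""
    return {"loss": -score, "status": STATUS_OK}

# Hypothetical F1 scores from three trials.
f1_by_trial = {"trial_a": 0.81, "trial_b": 0.88, "trial_c": 0.79}
results = {name: to_result(f1) for name, f1 in f1_by_trial.items()}

# The trial with the smallest loss is the one with the largest F1.
best_trial = min(results, key=lambda name: results[name]["loss"])
best_f1 = -results[best_trial]["loss"]
print(best_trial, best_f1)  # prints: trial_b 0.88
```

For a metric that is already a loss, such as RMSE or log loss, return it unchanged.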
The Trials class you pass to fmin() determines where Hyperopt runs each trial. Running on the driver alone versus across the whole cluster makes a large difference in wall-clock time.
| Item | Trials | SparkTrials |
|---|---|---|
| Execution location | Driver node (single machine) | Spark executors (entire cluster) |
| Parallelism | Sequential execution only | Controlled by the parallelism parameter |
| Suitable models | Distributed ML such as MLlib or Horovod, where training itself already uses the cluster; also small single-machine jobs | Single-machine ML such as scikit-learn (each executor runs one trial independently) |
| MLflow integration | Manual logging required | Each trial is automatically logged as a nested run |
| Recommended parallelism | Not applicable | At most the cluster's worker count; roughly the square root of max_evals is a common rule of thumb |
| Caveats | Even 100 trials run sequentially on a single driver | Too-high parallelism erodes TPE's sequential-optimization advantage |
SparkTrials' parallelism setting involves a tradeoff. Higher values run more trials at once, but TPE relies on completed trials to choose the next point. If parallelism is too high, many points are chosen while earlier trials are still in flight, and the search degrades toward random search. In practice, a common rule of thumb is to set parallelism to roughly the square root of max_evals.
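The arithmetic behind this tradeoff can be sketched as follows. Both helper functions are hypothetical, and the batch model is a simplification: it assumes all trials take equally long, whereas SparkTrials does not necessarily schedule trials in strict batches.

```python
import math

def suggest_parallelism(max_evals, n_workers):
    """Rule-of-thumb parallelism: about sqrt(max_evals), capped at the worker count."""
    return max(1, min(n_workers, round(math.sqrt(max_evals))))

def feedback_rounds(max_evals, parallelism):
    """Number of batches, assuming every trial in a batch takes equally long."""
    return math.ceil(max_evals / parallelism)

for p in (1, 8, 100):
    print(p, feedback_rounds(100, p))
# prints:
# 1 100
# 8 13
# 100 1
```

With parallelism=100 there is a single round, so no suggestion can use any earlier result. With parallelism=8 the search still gets about 13 rounds of feedback while running eight trials at a time.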
Optuna is a Bayesian optimization framework developed by Japan-based Preferred Networks. Unlike Hyperopt, it supports pruning (early termination) out of the box, letting you abort unpromising trials mid-flight to cut compute costs.
| API | Role |
|---|---|
| optuna.create_study(direction) | Create an optimization study ("minimize" or "maximize") |
| study.optimize(objective, n_trials) | Run the specified number of optimization trials |
| trial.suggest_int(name, low, high) | Search an integer parameter |
| trial.suggest_float(name, low, high, log) | Search a float parameter (log=True for log scale) |
| trial.suggest_categorical(name, choices) | Search categorical values |
| trial.report(value, step) | Report intermediate values (used for pruning decisions) |
| trial.should_prune() | Pruning check (True means terminate early) |
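The decision behind trial.report and trial.should_prune can be illustrated with a simplified median rule: stop a trial whose intermediate value at a given step is worse than the median of earlier trials at that step. This is a standard-library sketch of the idea, not Optuna's MedianPruner implementation. The function, its signature, and the score curves are hypothetical, and n_warmup_steps only borrows the name of the corresponding Optuna option.

```python
from statistics import median

def should_prune(value, step, finished_curves, n_warmup_steps=1):
    """Simplified median rule for a metric to maximize."""
    if step < n_warmup_steps:
        return False  # too early to judge
    peers = [curve[step] for curve in finished_curves if len(curve) > step]
    if not peers:
        return False  # nothing to compare against yet
    return value < median(peers)

# Hypothetical validation scores per step from three finished trials.
finished_curves = [
    [0.60, 0.70, 0.78],
    [0.55, 0.66, 0.74],
    [0.62, 0.73, 0.80],
]

weak_trial = [0.50, 0.58]    # below the median (0.70) at step 1
strong_trial = [0.61, 0.75]  # above the median at step 1

print(should_prune(weak_trial[1], 1, finished_curves))    # prints: True
print(should_prune(strong_trial[1], 1, finished_curves))  # prints: False
```

In an Optuna objective, the same check would sit inside the training loop: report the intermediate score, then stop the trial when the pruning check returns True.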
```python
import optuna
import mlflow
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

# X_train and y_train are assumed to be defined already
# (binary labels, since scoring="f1" is used below).

def objective(trial):
    params = {
        "n_estimators": trial.suggest_int("n_estimators", 50, 500),
        "max_depth": trial.suggest_int("max_depth", 3, 15),
        "learning_rate": trial.suggest_float("learning_rate", 1e-4, 1e-1, log=True),
        "subsample": trial.suggest_float("subsample", 0.5, 1.0),
        "min_samples_split": trial.suggest_int("min_samples_split", 2, 20),
    }
    clf = GradientBoostingClassifier(**params, random_state=42)
    score = cross_val_score(clf, X_train, y_train, cv=3, scoring="f1").mean()
    # Log each trial to MLflow as a nested run
    with mlflow.start_run(nested=True):
        mlflow.log_params(params)
        mlflow.log_metric("f1_cv", score)
    return score

with mlflow.start_run(run_name="optuna_gbm_tuning"):
    study = optuna.create_study(direction="maximize")
    study.optimize(objective, n_trials=100)
    # Record the best parameters
    mlflow.log_params(study.best_params)
    mlflow.log_metric("best_f1", study.best_value)
```

| Item | Hyperopt | Optuna |
|---|---|---|
| Search algorithms | TPE, Random Search, Adaptive TPE | TPE, CMA-ES, Random Search, Grid Search, GP |
| Distributed support on Databricks | Native support via SparkTrials | Can be parallelized with Joblib and similar tools; Optuna itself ships no SparkTrials equivalent |
| Pruning (early termination) | Not supported out of the box | MedianPruner, HyperbandPruner, and others built in |
| Objective return value | A dict containing loss (to minimize) and STATUS_OK | A scalar value (direction selects maximize/minimize) |
| MLflow integration | Automatic logging when using SparkTrials | Manual mlflow.start_run(nested=True) |
| Relation to AutoML | Engine behind Databricks AutoML | Not used by AutoML |
| Visualization | Use the MLflow UI | Visualize the search process with optuna.visualization |
| Exam importance | Frequently tested on ML Associate and ML Professional | Rarely tested directly, but useful for conceptual understanding |
Databricks AutoML automatically performs preprocessing, feature engineering, model selection, and hyperparameter tuning once you hand it data. Internally it uses Hyperopt's TPE algorithm to search hyperparameters for each model, and every trial is automatically logged to an MLflow experiment.
When you use Hyperopt with SparkTrials, each trial is automatically logged as a child run nested under the parent run. That lets you compare parameters and metrics across all trials in the MLflow UI.
| Exam | Scope | Key points |
|---|---|---|
| ML Associate | Hyperopt basics | Meaning and usage of fmin, hp.choice, hp.loguniform, and STATUS_OK |
| ML Associate | Differences between search strategies | Characteristics of Grid Search vs Random Search vs Bayesian Optimization |
| ML Professional | SparkTrials vs Trials | Difference between distributed and single-machine execution; setting parallelism |
| ML Professional | MLflow integration | Automatic nested-run logging with SparkTrials and how to compare results |
| ML Professional | Relationship with AutoML | The fact that AutoML uses Hyperopt internally; using generated notebooks |
ML Professional
Question 1
An ML engineer wants to run 100 hyperparameter tuning trials on a scikit-learn random forest on an 8-worker Databricks cluster. Which approach best maximizes cluster resource usage while keeping TPE's search efficiency intact?
Correct answer: B
SparkTrials distributes trials across Spark executors. Setting parallelism=8 matches the worker count and stays close to 10, the square root of max_evals (100), so it is a reasonable choice. Option A does not use the cluster's resources. Option C, with parallelism=100, launches every trial before any result is available, which wipes out TPE's advantage of using past results and makes the search equivalent to random search. Option D, Optuna, lacks SparkTrials integration on Databricks and is not the best fit here.
Should I use Hyperopt or Optuna?
If integration with the Databricks ecosystem matters most, Hyperopt is the first choice. Its strengths are cluster-wide distributed tuning via SparkTrials, being the engine behind AutoML, and automatic integration with MLflow Tracking. On the other hand, Optuna fits better when you need pruning (early termination) for efficiency, search algorithms beyond TPE (such as CMA-ES), or when you want to run the same code across other clouds and on-prem. For exam prep, Hyperopt is the priority.
What is the difference between SparkTrials and Trials?
Trials is a class that runs trials sequentially on a single machine, using only the resources of one driver node. SparkTrials distributes trials across Spark executors and runs them in parallel across the cluster. For example, an 8-worker cluster can run up to 8 trials concurrently, drastically reducing the time to complete 100 trials. This distinction comes up often on the ML Professional exam.
How does hyperparameter tuning relate to AutoML?
Databricks AutoML uses Hyperopt internally to search hyperparameters. AutoML is a higher-level layer that automates preprocessing, feature engineering, model selection, and tuning, and every trial is automatically logged to MLflow. The notebooks AutoML generates contain Hyperopt code, which you can customize to build your own tuning pipeline.
NicheeLab Editorial Team
NicheeLab editorial team focused on data engineering and cloud certification learning. Content is structured around practical study needs and official exam domains.