Issue
I am running MLflow for the first time, on port 5000.
While testing it, I get an error saying that mlflow has no attribute last_active_run.
However, the code is an example provided by MLflow itself.
The example is linked here: mlflow
What is the problem, and how should I change the code?
shell
wget https://raw.githubusercontent.com/mlflow/mlflow/master/examples/sklearn_autolog/utils.py
wget https://raw.githubusercontent.com/mlflow/mlflow/master/examples/sklearn_autolog/pipeline.py
pipeline.py
from pprint import pprint

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline

import mlflow
from utils import fetch_logged_data


def main():
    # enable autologging
    mlflow.sklearn.autolog()

    # prepare training data
    X = np.array([[1, 1], [1, 2], [2, 2], [2, 3]])
    y = np.dot(X, np.array([1, 2])) + 3

    # train a model
    pipe = Pipeline([("scaler", StandardScaler()), ("lr", LinearRegression())])
    pipe.fit(X, y)
    run_id = mlflow.last_active_run().info.run_id
    print("Logged data and model in run: {}".format(run_id))

    # show logged data
    for key, data in fetch_logged_data(run_id).items():
        print("\n---------- logged {} ----------".format(key))
        pprint(data)


if __name__ == "__main__":
    main()
utils.py
import mlflow
from mlflow.tracking import MlflowClient


def yield_artifacts(run_id, path=None):
    """Yield all artifacts in the specified run"""
    client = MlflowClient()
    for item in client.list_artifacts(run_id, path):
        if item.is_dir:
            yield from yield_artifacts(run_id, item.path)
        else:
            yield item.path


def fetch_logged_data(run_id):
    """Fetch params, metrics, tags, and artifacts in the specified run"""
    client = MlflowClient()
    data = client.get_run(run_id).data
    # Exclude system tags: https://www.mlflow.org/docs/latest/tracking.html#system-tags
    tags = {k: v for k, v in data.tags.items() if not k.startswith("mlflow.")}
    artifacts = list(yield_artifacts(run_id))
    return {
        "params": data.params,
        "metrics": data.metrics,
        "tags": tags,
        "artifacts": artifacts,
    }
Error message
INFO mlflow.utils.autologging_utils: Created MLflow autologging run with ID '8cc3f4e03b4e417b95a64f1a9a41be63', which will track hyperparameters, performance metrics, model artifacts, and lineage information for the current sklearn workflow
Traceback (most recent call last):
  File "/Users/taein/Desktop/mlflow/pipeline.py", line 33, in <module>
    main()
  File "/Users/taein/Desktop/mlflow/pipeline.py", line 23, in main
    run_id = mlflow.last_active_run().info.run_id
AttributeError: module 'mlflow' has no attribute 'last_active_run'
Thanks for your help.
Solution
This is caused by the mlflow version that you mentioned in the comments. The mlflow.last_active_run() API was introduced in mlflow 1.25.0, so you should either upgrade mlflow or use the previous version of the example code, available here:
wget https://raw.githubusercontent.com/mlflow/mlflow/5e2cb3baef544b00a972dff9dd6fb764be20510b/examples/sklearn_autolog/utils.py
wget https://raw.githubusercontent.com/mlflow/mlflow/5e2cb3baef544b00a972dff9dd6fb764be20510b/examples/sklearn_autolog/pipeline.py
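To decide which of the two options applies, it may help to check the installed version against 1.25.0 before running the example. The following is a minimal sketch, not part of the MLflow examples: the helper names parse_version and supports_last_active_run are hypothetical, and the parsing only looks at the leading numeric components, so a pre-release such as 1.25.0rc0 is treated like 1.25.0.

```python
import re

# Release that introduced mlflow.last_active_run(), as stated in the answer above.
MIN_VERSION = (1, 25, 0)


def parse_version(version):
    """Return the leading numeric components of a version string as a 3-tuple.

    Hypothetical helper: suffixes such as ".dev0" or "rc1" are ignored, and a
    missing patch component is treated as 0.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError("unrecognised version string: {!r}".format(version))
    return tuple(int(part or 0) for part in match.groups())


def supports_last_active_run(version):
    """Return True if the given mlflow version is at least 1.25.0."""
    return parse_version(version) >= MIN_VERSION


if __name__ == "__main__":
    # Sample version strings for illustration only.
    for sample in ("1.24.0", "1.25.0", "2.3"):
        print(sample, supports_last_active_run(sample))
```

With mlflow installed, the check would be called as supports_last_active_run(mlflow.__version__). If it returns False, either upgrade with pip install --upgrade mlflow or use the older example files from the wget commands above.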
Answered By - Matin Zivdar