Issue
Context
I am trying to use a pre-trained ONNX model to run inference on image data in Unity. The model is linked to the executing component in Unity as an asset called modelAsset. I am using Barracuda version 1.0.0 and executing the model as follows:
using Unity.Barracuda;

// Initialisation
this.model = ModelLoader.Load(this.modelAsset);
this.worker = WorkerFactory.CreateWorker(WorkerFactory.Type.CSharpBurst, this.model);

// Loop: build an input tensor of shape (batch, height, width, channels)
// from the flat data array, run the model and fetch one of its outputs
Tensor tensor = new Tensor(1, IMAGE_H, IMAGE_W, 3, data);
worker.Execute(tensor);
Tensor modelOutput = worker.PeekOutput(OUTPUT_NAME);
tensor.Dispose(); // Barracuda tensors hold native memory; dispose them each iteration
The model has a single input tensor, which receives image data of IMAGE_H × IMAGE_W pixels with 3 channels holding RGB values between -0.5 and 0.5. The model has multiple outputs, one of which I retrieve in the last line shown above.
Expected behavior
Given the same input data, the converted ONNX model running in Barracuda in Unity should produce the same output as the PyTorch model and the ONNX model in Python (PyTorch and ONNXRuntime).
Problem
In Python, both the ONNX and the PyTorch model produce the same output. However, the same ONNX model running in Barracuda produces a different output. The main difference is that we expect a heatmap, but Barracuda consistently produces near-zero values somewhere between -0.0004 and 0.001, arranged in regular patterns.
This makes it almost seem like the model weights are not properly loaded.
What we found
When converting to ONNX as per the Barracuda manual, we found that if we did not set the PyTorch net to inference mode before conversion (link), ONNXRuntime in Python generated these same incorrect results. In other words, it looks like the inference mode is saved into the ONNX model and is recognized by ONNXRuntime in Python, but not by Barracuda.
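For reference, a minimal sketch of that export step, assuming net stands in for the trained PyTorch model and using placeholder input dimensions:

import torch

IMAGE_H, IMAGE_W = 256, 256  # placeholder input size

# net is assumed to be the trained PyTorch model
net.eval()  # switch BatchNorm/Dropout layers to inference behaviour

# torch.onnx.export traces the model in its current mode, so the
# inference-mode behaviour (e.g. BatchNorm using its running statistics)
# is baked into the exported graph
dummy_input = torch.randn(1, 3, IMAGE_H, IMAGE_W)  # NCHW, as ONNX expects
torch.onnx.export(net, dummy_input, "model.onnx")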
Our question
In general:
- How do we get this model in Barracuda in Unity to produce the same results as ONNXRuntime/PyTorch in Python?
And potentially:
- How does the inference mode get embedded into the ONNX file and how is it used in ONNXRuntime vs Barracuda?
Solution
So it turned out that there were two problems. First, the input data had been arranged according to the ONNX model's dimensions; however, Barracuda expects the data in a different layout. As the Barracuda documentation states: "The native ONNX data layout is NCHW, or channels-first. Barracuda automatically converts ONNX models to NHWC layout." Our data had been flattened into an array channels-first, like in the Python implementation, which created the first mismatch.
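In numpy terms, a sketch of the difference, using a small stand-in array:

import numpy as np

# Stand-in for the image: (H, W, C) with distinguishable values
img = np.arange(2 * 2 * 3, dtype=np.float32).reshape(2, 2, 3)

# Channels-first (NCHW) flattening, as the Python/ONNX pipeline uses it:
nchw_flat = img.transpose(2, 0, 1).ravel()  # [0, 3, 6, 9, 1, 4, ...]

# Channels-last (NHWC) flattening, which Barracuda's
# Tensor(1, h, w, 3, data) constructor expects:
nhwc_flat = img.ravel()                     # [0, 1, 2, 3, 4, 5, ...]

Feeding a channels-first array where a channels-last one is expected scrambles every pixel, which matches the regular low-value patterns we saw.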
Secondly, the Y-axis of the input image was inverted, which made the model unable to recognize any people.
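A corresponding sketch of that fix, again in numpy (whether the flip is needed, and where, depends on how the image is read; Unity's Texture2D.GetPixels, for example, returns rows bottom-to-top):

import numpy as np

img = np.arange(2 * 2 * 3, dtype=np.float32).reshape(2, 2, 3)

# Flip the row (Y) axis so the top of the image comes first,
# then flatten in the NHWC order Barracuda expects
nhwc_flat = np.flipud(img).ravel()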
After correcting for these issues, the implementation works fine!
Answered By - Rens van der Veldt