In Fig.~\ref{598976}, we show the maximum, median, and minimum number of steps required per fetch over the 100 runs. Interestingly, the median of the a posteriori configuration, much like the average, converges toward the median of the a priori configuration. The maximum, on the other hand, does not decrease over time. A likely reason is that, after only 100 fetches, the world is not sufficiently explored to compute reliable plans for all locations; with roughly 100 cells where an item could be placed, this seems plausible. Similarly, over 100 runs there is a high chance that some item location lies close to the start, which would explain the very stable minimum for both a posteriori and a priori.
In Tab.~\ref{tab:res_avg_runs}, we compare the average total steps, total plans, and total runtime summed over a whole run, i.e., 100 fetches, for the a priori and a posteriori configurations on the 11x11 grid. For the configuration a priori - 0 agents, the intelligent agent needs only 100 plans, meaning that no replanning was necessary. This was expected, as everything about the world is known. As the number of agents increases, so does the number of replans, because there is a higher chance for the intelligent agent to collide with another agent. For the a posteriori setup, we see a similar increase in total plans and steps between 0 agents and 4 agents. However, the difference between a priori and a posteriori is substantial. One explanation could be that, a priori, the shelf positions are known and the intelligent agent only collides with other agents. In contrast, a posteriori the intelligent agent initially collides mostly with shelves. Due to its lack of knowledge, the agent generates a plan that bypasses the shelf by just one block. In most cases, however, a shelf is directly adjacent to another shelf, which in turn leads to another collision after only a single step. The average number of steps per plan also supports this explanation: a priori plans average 15 steps, whereas a posteriori plans average only two steps.
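To make this relation explicit, the average plan length is simply the ratio of total steps to total plans per run. As a rough sketch (the cutoff length $k$ is introduced here only for illustration), if a plan is on average interrupted by a collision after $k$ steps, then
\begin{equation*}
    \overline{s}_{\text{plan}} = \frac{\text{total steps}}{\text{total plans}} \approx k,
\end{equation*}
so the observed average of about two steps per plan in the a posteriori setup corresponds to a collision, and therefore a replan, after roughly every second step.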
From Tab.~\ref{tab:res_avg_runs}, we can see that the runtime correlates mostly with the number of plans needed. Although we implemented some improvements for the RBL planning algorithm, it remains by far the largest runtime bottleneck. Only a fraction of the planning time is spent on the SFL reliability calculation, which grows only with the number of possible actions for which the reliability has to be calculated, as we proved in a previous section. Combining the SFL approach with a more efficient planning algorithm would therefore be expected to reduce the runtime considerably.
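To illustrate this correlation, the total runtime of a run can be approximated by a simple cost model (the symbols $n_{\text{plans}}$, $\bar{t}_{\text{plan}}$, $n_{\text{act}}$, and $\bar{t}_{\text{rel}}$ are introduced here only for illustration):
\begin{equation*}
    t_{\text{total}} \approx n_{\text{plans}} \cdot \left( \bar{t}_{\text{plan}} + n_{\text{act}} \cdot \bar{t}_{\text{rel}} \right),
\end{equation*}
where $n_{\text{plans}}$ is the number of plans per run, $\bar{t}_{\text{plan}}$ the average RBL planning time, $n_{\text{act}}$ the number of possible actions for which a reliability is calculated, and $\bar{t}_{\text{rel}}$ the average cost of a single SFL reliability calculation. Since $\bar{t}_{\text{plan}} \gg n_{\text{act}} \cdot \bar{t}_{\text{rel}}$ in our experiments, the first term dominates, which matches the observed correlation between runtime and the number of plans.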