I am researching different neighbourhoods to identify which are appropriate for a multiprocessor implementation on multiple distributed FPGAs. Our current hardware implementation is 8 custom boards with 4 Xilinx Spartan-3 FPGAs. Each cell in the grid represents a single functional unit microprocessor.
We have undertaken initial testing using the method that you describe, with both the Moore and von Neumann neighbourhoods. I have just completed this testing for horizontal rectangular grids, and my supervisors have suggested that we repeat it on vertical rectangles using the same grid sizes and shapes, so that we can make a judgement on the outcomes for the two orientations.
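To make the two neighbourhoods concrete, here is a minimal Python sketch of them on a bounded rectangular grid. The names `neighbours` and `transpose` are only illustrative, and it assumes radius 1 with no wrap-around at the edges, which may differ from the hardware implementation.

```python
# Offsets are (row, col) steps for neighbourhoods of radius 1.
VON_NEUMANN = [(-1, 0), (1, 0), (0, -1), (0, 1)]
MOORE = VON_NEUMANN + [(-1, -1), (-1, 1), (1, -1), (1, 1)]

def neighbours(cell, rows, cols, offsets):
    """Return the in-bounds neighbours of `cell` on a rows x cols grid.

    Assumption: the grid edges are hard boundaries (no wrap-around).
    """
    r, c = cell
    return [(r + dr, c + dc) for dr, dc in offsets
            if 0 <= r + dr < rows and 0 <= c + dc < cols]

def transpose(cell):
    """Map a cell of a horizontal grid to the matching vertical-grid cell."""
    return (cell[1], cell[0])

# A horizontal 4 x 8 grid and the same shape turned vertical (8 x 4).
corner = (0, 7)
horizontal = neighbours(corner, 4, 8, MOORE)
vertical = neighbours(transpose(corner), 8, 4, MOORE)
```

Under these assumptions the vertical grid is just the transpose of the horizontal one, so the neighbour sets match cell for cell. Any difference in results between the two orientations would then have to come from something else, such as the order in which directions are tried or where the agent starts.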
Whilst in this case we are using a single agent, we also use multiple agents. When one agent becomes blocked and can go no further, the others are either stopped as well or allowed to continue. Additionally, there can be a time constraint, after which the agent dies and its cell is cleared.
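The multi-agent options can be sketched roughly as follows in Python. Everything here is hypothetical naming (`Agent`, `step`, `stop_all_on_block`, `max_age`), and it assumes that visited cells stay marked as occupied, that agents move in a fixed order, and that each agent takes the first free von Neumann neighbour. It is a simplified model, not the FPGA implementation.

```python
from dataclasses import dataclass

@dataclass
class Agent:
    pos: tuple            # (row, col)
    age: int = 0          # number of moves made so far
    alive: bool = True
    blocked: bool = False

def von_neumann(cell, rows, cols):
    """In-bounds von Neumann neighbours (no wrap-around, an assumption)."""
    r, c = cell
    cand = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
    return [(i, j) for i, j in cand if 0 <= i < rows and 0 <= j < cols]

def step(agents, occupied, rows, cols, stop_all_on_block=True, max_age=None):
    """Move each active agent once; return False when the run should halt."""
    for a in agents:
        if not a.alive or a.blocked:
            continue
        if max_age is not None and a.age >= max_age:
            # Time constraint reached: the agent dies and clears its cell.
            a.alive = False
            occupied.discard(a.pos)
            continue
        free = [n for n in von_neumann(a.pos, rows, cols) if n not in occupied]
        if not free:
            a.blocked = True
            if stop_all_on_block:
                return False      # one blocked agent stops all the others
            continue              # otherwise the others carry on
        a.pos = free[0]           # assumption: take the first free neighbour
        occupied.add(a.pos)
        a.age += 1
    return any(a.alive and not a.blocked for a in agents)
```

For example, with two agents at opposite ends of a 1 x 3 grid, the first agent takes the middle cell and the second is immediately blocked. With `stop_all_on_block=True` the run halts at that point. With `False` the first agent gets one more turn before it is blocked too.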