I don’t see any obvious issues there. While the ESP is flashing the AVR, the AVR won’t be running any code, and the /CS lines of all the SPI devices should be held high by external pull-ups, so those devices won’t be listening or interfering. Once the ESP has finished flashing, it will need to do the equivalent of unplugging itself from the ICSP header, presumably by gently pulling high all the signals running over to the AVR. So you don’t really have two SPI masters: you only have one (the AVR), plus an ISP programmer, which could be an AVR-ISPII or an ESP running some software.
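As a rough sketch of that “unplugging” step on the ESP side, assuming the Arduino core is used on the ESP and that external pull-ups hold the lines high once they are released: the pin numbers below are hypothetical placeholders, and `releaseIcsp()` is just a name I made up for illustration. The stub section at the top only exists so the logic can also be compiled on a PC.

```cpp
#ifdef ARDUINO
#include <Arduino.h>
#else
// Host-side stand-ins so the logic can be compiled and checked off-target.
#include <cstdint>
enum : uint8_t { INPUT = 0, OUTPUT = 1 };
static uint8_t pinModes[40];
static inline void pinMode(uint8_t pin, uint8_t mode) { pinModes[pin] = mode; }
#endif

// Hypothetical ESP GPIO numbers wired to the AVR's ICSP header; adjust to your board.
// RESET is listed last so the SPI lines are released before the AVR restarts.
const uint8_t ISP_PINS[] = {14 /* SCK */, 12 /* MISO */, 13 /* MOSI */, 5 /* RESET */};

// "Unplug" the ESP: turn every ISP pin into a high-impedance input.
// Assumes an external pull-up on RESET lets the AVR start running again.
void releaseIcsp() {
  for (uint8_t pin : ISP_PINS) {
    pinMode(pin, INPUT);
  }
}
```

You would call `releaseIcsp()` once flashing is done. If your board has no external pull-ups on those lines, `INPUT_PULLUP` might be the closer match to “gently pulling high”, but check the voltage levels between the ESP and the AVR first.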
I’ve got 8 devices on the SPI bus in that energy monitor pictured above, and I don’t have any problem programming it with the AVR-ISPII via the ICSP header. The AVR half of your design shouldn’t really know or care what you’re “attaching” to the ICSP header. An AVR-ISPII or a hardwired ESP should look identical (provided it disappears once flashed).
You’ll need to take a bit of care with the /CS lines if you plan on having more than one device on the SPI bus. Do you intend to use the RF module as well as the input mux? You’ll probably find that both the RF library and the MCP23S17 library assume exclusive use of D10 (on an Uno) as the /CS for their own device, so one of them will have to be moved to a different pin.
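To illustrate the /CS point, here is a minimal sketch of giving each device its own select pin and making sure only one is ever low at a time. The pin choices (D10 for the RF module, D9 for the MCP23S17) and the helper names `deselectAll()` and `selectOnly()` are my own assumptions, not something either library provides; how you tell each library which pin to use depends on the library. The stub section at the top is only there so the logic compiles on a PC as well.

```cpp
#ifdef ARDUINO
#include <Arduino.h>
#else
// Host-side stand-ins so the logic can be compiled and checked off-target.
#include <cstdint>
enum : uint8_t { LOW = 0, HIGH = 1, INPUT = 0, OUTPUT = 1 };
static uint8_t pinLevel[20];
static uint8_t pinDir[20];
static inline void pinMode(uint8_t pin, uint8_t mode) { pinDir[pin] = mode; }
static inline void digitalWrite(uint8_t pin, uint8_t level) { pinLevel[pin] = level; }
#endif

// Hypothetical pin assignment: pick any free pins on your board.
const uint8_t RF_CS = 10;   // RF module /CS
const uint8_t MUX_CS = 9;   // MCP23S17 /CS
const uint8_t ALL_CS[] = {RF_CS, MUX_CS};

// Park every /CS high before any SPI traffic so no device is selected by accident.
// Writing HIGH before switching to OUTPUT avoids a brief low pulse on an AVR.
void deselectAll() {
  for (uint8_t cs : ALL_CS) {
    digitalWrite(cs, HIGH);
    pinMode(cs, OUTPUT);
  }
}

// Select exactly one device: all /CS lines go high first, then the chosen one goes low.
void selectOnly(uint8_t cs) {
  for (uint8_t other : ALL_CS) {
    digitalWrite(other, HIGH);
  }
  digitalWrite(cs, LOW);
}
```

Call `deselectAll()` early in `setup()`, before either library is initialised. Even if you move both devices off D10, keep D10 configured as an output on the Uno, because the AVR’s SPI hardware can drop out of master mode if its SS pin is an input and gets pulled low.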