Fixed bug in from_checkpoint.py for recurrent PG/PPO models
The bug came from the call to compute_single_action, which received empty seq_lens and state arguments. This broke the script, so learned policies could not be simulated with from_checkpoint.py. It is now fixed. The QMIX LSTM model does not appear to be affected by this bug, so it is left untouched for now.
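For reviewers, here is a minimal sketch of the pattern the fix relies on: a recurrent policy needs a non-empty state that is initialised once and then threaded through every compute_single_action call. It is not the actual code in from_checkpoint.py. StubRecurrentPolicy and rollout are hypothetical names used only for illustration, and the stub's method signatures are simplified assumptions modelled on the RLlib Policy methods of the same name, so the example runs without RLlib installed.

```python
from typing import Any, List, Tuple


class StubRecurrentPolicy:
    """Hypothetical stand-in for a recurrent policy (not an RLlib class)."""

    def get_initial_state(self) -> List[float]:
        # A real LSTM policy would return its zeroed hidden and cell states.
        return [0.0]

    def compute_single_action(
        self, obs: float, state: List[float] = None
    ) -> Tuple[Any, List[float], dict]:
        # Mimic the failure mode described above: an empty state is rejected.
        if not state:
            raise ValueError("recurrent policy needs a non-empty state")
        new_state = [state[0] + obs]
        action = 1 if new_state[0] > 1.0 else 0
        return action, new_state, {}


def rollout(policy, observations):
    """Step through observations, carrying the recurrent state forward."""
    state = policy.get_initial_state()  # initialise once per episode
    actions = []
    for obs in observations:
        # Pass the previous state in and keep the returned state for the next step.
        action, state, _ = policy.compute_single_action(obs, state=state)
        actions.append(action)
    return actions, state


if __name__ == "__main__":
    print(rollout(StubRecurrentPolicy(), [0.5, 0.5, 0.5]))
```

The stub only demonstrates the control flow. The exact way seq_lens is supplied depends on the RLlib version in use and is not shown here.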