schola.scripts.ray.settings.IMPALASettings

class schola.scripts.ray.settings.IMPALASettings(vtrace=True, vtrace_clip_rho_threshold=1.0, vtrace_clip_pg_rho_threshold=1.0) : Bases: RLLibAlgorithmSpecificSettings

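Dataclass for settings specific to the IMPALA (Importance Weighted Actor-Learner Architecture) algorithm. This class defines the parameters used by IMPALA, including the V-trace settings used for off-policy correction.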

Methods


`__init__`([vtrace, …])
`get_parser`()	Add the settings to the parser or subparser.
`get_settings_dict`()	Get the settings as a dictionary keyed by the correct parameter names in Ray.

Attributes


`name`
`rllib_config`
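`vtrace`	Whether to use the V-trace algorithm for off-policy correction in the IMPALA algorithm.
`vtrace_clip_pg_rho_threshold`	The clip threshold for V-trace rho values in the policy gradient.
`vtrace_clip_rho_threshold`	The clip threshold for V-trace rho values.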

Parameters:

- vtrace (bool)
- vtrace_clip_rho_threshold (float)
- vtrace_clip_pg_rho_threshold (float)

__init__(vtrace=True, vtrace_clip_rho_threshold=1.0, vtrace_clip_pg_rho_threshold=1.0)

Parameters:

- vtrace (bool)
- vtrace_clip_rho_threshold (float)
- vtrace_clip_pg_rho_threshold (float)

Return type: None

classmethod get_parser() : Add the settings to the parser or subparser.

get_settings_dict() : Get the settings as a dictionary keyed by the correct parameter names in Ray.
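To illustrate what a settings dictionary keyed by Ray parameter names might look like, here is a minimal sketch. `ImpalaSettingsSketch` is a hypothetical stand-in that mirrors only the three documented fields; it is not Schola's implementation, and the assumption that the field names map one-to-one onto Ray's keyword names is ours.

```python
from dataclasses import asdict, dataclass


@dataclass
class ImpalaSettingsSketch:
    """Hypothetical stand-in for IMPALASettings (illustration only)."""

    vtrace: bool = True
    vtrace_clip_rho_threshold: float = 1.0
    vtrace_clip_pg_rho_threshold: float = 1.0

    def get_settings_dict(self) -> dict:
        # Assumption: each field name is already the keyword Ray expects,
        # so a plain field-to-value mapping is enough for this sketch.
        return asdict(self)


# Override one threshold and keep the documented defaults for the rest.
settings = ImpalaSettingsSketch(vtrace_clip_rho_threshold=2.0)
settings_dict = settings.get_settings_dict()
print(settings_dict)
```

In RLlib, `vtrace`, `vtrace_clip_rho_threshold`, and `vtrace_clip_pg_rho_threshold` are also keyword arguments of `IMPALAConfig.training()`, so a dictionary of this shape could plausibly be unpacked into that call.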

property name*: str*

property rllib_config*: Type[IMPALAConfig]*

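vtrace*: bool* = True : Whether to use the V-trace algorithm for off-policy correction in the IMPALA algorithm. V-trace corrects the bias introduced by training on off-policy data, which helps keep value estimates accurate and stable.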
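For reference, the V-trace value target from the IMPALA paper (Espeholt et al., 2018) can be written as below, where $\pi$ is the learner policy, $\mu$ is the behavior policy that generated the data, and $\bar\rho$, $\bar c$ are clipping thresholds. The symbols follow the paper rather than this class; reading $\bar\rho$ as the quantity controlled by `vtrace_clip_rho_threshold` is an assumption based on the parameter names.

$$
v_s = V(x_s) + \sum_{t=s}^{s+n-1} \gamma^{t-s} \Big( \prod_{i=s}^{t-1} c_i \Big) \delta_t V,
\qquad
\delta_t V = \rho_t \big( r_t + \gamma V(x_{t+1}) - V(x_t) \big)
$$

$$
\rho_t = \min\!\Big( \bar\rho,\ \frac{\pi(a_t \mid x_t)}{\mu(a_t \mid x_t)} \Big),
\qquad
c_i = \min\!\Big( \bar c,\ \frac{\pi(a_i \mid x_i)}{\mu(a_i \mid x_i)} \Big)
$$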

vtrace_clip_pg_rho_threshold*: float* = 1.0 : The clip threshold for V-trace rho values in the policy gradient.

vtrace_clip_rho_threshold*: float* = 1.0 : The clip threshold for V-trace rho values.
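The effect of the two thresholds can be sketched in a few lines of plain Python. This assumes the usual V-trace convention that a rho value is the importance ratio between the learner policy and the behavior policy, capped from above at the threshold. The helper `clip_rho` and the sample probabilities are hypothetical and are not part of Schola or RLlib.

```python
def clip_rho(ratio: float, threshold: float) -> float:
    """Cap an importance ratio at the given threshold (V-trace style)."""
    return min(threshold, ratio)


# Hypothetical action probabilities under the learner policy (pi)
# and under the behavior policy (mu) that collected the data.
pi_probs = [0.25, 0.5, 0.75]
mu_probs = [0.5, 0.5, 0.25]
ratios = [p / m for p, m in zip(pi_probs, mu_probs)]  # 0.5, 1.0, 3.0

# The default threshold of 1.0 caps the largest ratio at 1.0.
value_rhos = [clip_rho(r, 1.0) for r in ratios]

# A looser threshold of 2.0 lets more of the large ratio through.
pg_rhos = [clip_rho(r, 2.0) for r in ratios]

print(value_rhos)
print(pg_rhos)
```

With the default value of 1.0, ratios above 1 are truncated, which limits the variance of the correction at the cost of some bias. Raising a threshold weights strongly off-policy samples more heavily.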