Hopes: Estimators
=================

Roadmap
-------

- [x] Implement Inverse Probability Weighting (IPW) estimator
- [x] Implement Self-Normalized Inverse Probability Weighting (SNIPW) estimator
- [x] Implement Direct Method (DM) estimator
- [x] Implement Trajectory-wise Importance Sampling (TIS) estimator
- [x] Implement Self-Normalized Trajectory-wise Importance Sampling (SNTIS) estimator
- [x] Implement Per-Decision Importance Sampling (PDIS) estimator
- [x] Implement Self-Normalized Per-Decision Importance Sampling (SNPDIS) estimator
- [ ] Implement Doubly Robust (DR) estimator

Implemented estimators
-----------------------

Currently, the following estimators are implemented:

.. autosummary::
   :nosignatures:

   hopes.ope.estimators.BaseEstimator
   hopes.ope.estimators.InverseProbabilityWeighting
   hopes.ope.estimators.SelfNormalizedInverseProbabilityWeighting
   hopes.ope.estimators.DirectMethod
   hopes.ope.estimators.TrajectoryWiseImportanceSampling
   hopes.ope.estimators.SelfNormalizedTrajectoryWiseImportanceSampling
   hopes.ope.estimators.PerDecisionImportanceSampling
   hopes.ope.estimators.SelfNormalizedPerDecisionImportanceSampling

Estimators documentation
------------------------

.. autoclass:: hopes.ope.estimators.InverseProbabilityWeighting
    :members:
    :undoc-members:
    :show-inheritance:

.. autoclass:: hopes.ope.estimators.SelfNormalizedInverseProbabilityWeighting
    :members:
    :undoc-members:
    :show-inheritance:

.. autoclass:: hopes.ope.estimators.DirectMethod
    :members:
    :undoc-members:
    :show-inheritance:

.. autoclass:: hopes.ope.estimators.TrajectoryWiseImportanceSampling
    :members:
    :undoc-members:
    :show-inheritance:

.. autoclass:: hopes.ope.estimators.SelfNormalizedTrajectoryWiseImportanceSampling
    :members:
    :undoc-members:
    :show-inheritance:

.. autoclass:: hopes.ope.estimators.PerDecisionImportanceSampling
    :members:
    :undoc-members:
    :show-inheritance:

.. autoclass:: hopes.ope.estimators.SelfNormalizedPerDecisionImportanceSampling
    :members:
    :undoc-members:
    :show-inheritance:

Implementing a new estimator
----------------------------

To implement a new estimator, you need to subclass :class:`hopes.ope.estimators.BaseEstimator` and implement:

- :meth:`hopes.ope.estimators.BaseEstimator.estimate_weighted_rewards`. It should return the estimated weighted rewards.
- :meth:`hopes.ope.estimators.BaseEstimator.estimate_policy_value`. It should return the estimated value of the target policy. It typically uses the estimated weighted rewards.

Optionally, you can implement :meth:`hopes.ope.estimators.BaseEstimator.short_name` to provide a short name for the estimator.
When not implemented, the uppercase letters of the class name are used.

Below is the `BaseEstimator` class documentation.

.. autoclass:: hopes.ope.estimators.BaseEstimator
    :members: estimate_weighted_rewards, estimate_policy_value, short_name