WORKING PAPER

Online Action Learning in High Dimensions: A New Exploration Rule for Contextual εₜ-Greedy Heuristics

2020

Marcelo Medeiros, Claudio Cardoso Flores.

TD n. 674


Bandit problems are pervasive in various fields of research and are also present in several practical applications. Examples, including dynamic pricing and assortment and the design of auctions and incentives, permeate a large number of sequential treatment experiments. Different applications impose distinct levels of restrictions on viable actions. Some favor diversity of outcomes, while others require harmful actions to be closely monitored or mainly avoided. In this paper, we extend one of the most popular bandit solutions, the original εₜ-greedy heuristics, to high-dimensional contexts. Moreover, we introduce a competing exploration mechanism that relies on searching sets based on order statistics. We view our proposals as alternatives for cases where pluralism is valued or, in the opposite direction, cases where the end-user should carefully tune the range of exploration of new actions. We find reasonable bounds for the cumulative regret of a decaying εₜ-greedy heuristic in both cases, and we provide an upper bound for the initialization phase implying that the regret bounds under order-statistics exploration are at most equal to, and typically better than, those obtained when random searching is the sole exploration mechanism. Additionally, we show that end-users have sufficient flexibility to avoid harmful actions, since any cardinality for the higher-order statistics can be used to achieve a stricter upper bound. In a simulation exercise, we show that the algorithms proposed in this paper outperform simple and adapted counterparts.
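To illustrate the idea sketched in the abstract, the following is a minimal, non-authoritative Python sketch of a decaying εₜ-greedy rule in which exploration draws are restricted to a searching set formed by the k highest-order statistics of the estimated rewards. All names and parameters here (the decay schedule, the cardinality k, the reward model) are illustrative assumptions, not the paper's exact specification:

```python
import numpy as np

rng = np.random.default_rng(0)

n_arms = 5
n_rounds = 2000
true_means = rng.uniform(0.0, 1.0, n_arms)  # unknown to the learner

counts = np.zeros(n_arms)      # number of pulls per arm
estimates = np.zeros(n_arms)   # running mean reward per arm
k = 3  # cardinality of the order-statistics searching set (assumed tuning knob)

for t in range(1, n_rounds + 1):
    eps = min(1.0, 5.0 / t)  # decaying exploration probability (assumed schedule)
    if rng.random() < eps:
        # Explore, but only among the k arms with the highest estimated
        # rewards (the "higher-order statistics" searching set), instead
        # of uniformly over all arms as in plain random exploration.
        searching_set = np.argsort(estimates)[-k:]
        arm = int(rng.choice(searching_set))
    else:
        # Exploit the current best estimate.
        arm = int(np.argmax(estimates))
    reward = true_means[arm] + rng.normal(0.0, 0.1)  # noisy reward draw
    counts[arm] += 1
    estimates[arm] += (reward - estimates[arm]) / counts[arm]
```

Shrinking k narrows the searching set, which is the lever the abstract describes for end-users who need to keep harmful actions out of the exploration range; k = n_arms recovers uniform random exploration.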

 
