
Your search for author "Lazaric, Alessandro" returned 319 results.


Search Results

52. Rotting bandits are not harder than stochastic ones

53. Near Optimal Exploration-Exploitation in Non-Communicating Markov Decision Processes

54. Distributed Adaptive Sampling for Kernel Matrix Approximation

55. Efficient Bias-Span-Constrained Exploration-Exploitation in Reinforcement Learning

56. Second-Order Kernel Online Convex Optimization with Adaptive Sketching

57. Experimental results: Reinforcement Learning of POMDPs using Spectral Methods

58. Thompson Sampling for Linear-Quadratic Control Problems

59. Exploration–Exploitation in MDPs with Options

60. Active Learning for Accurate Estimation of Linear Models

61. Linear Thompson Sampling Revisited

62. Reinforcement Learning in Rich-Observation MDPs using Spectral Methods

63. Analysis of Kelner and Levin graph sparsification algorithm for a streaming setting

64. Open Problem: Approximate Planning of POMDPs in the class of Memoryless Policies

65. Reinforcement Learning of POMDPs using Spectral Methods

66. Incremental Spectral Sparsification for Large-Scale Graph-Based Semi-Supervised Learning

67. Upper-Confidence-Bound Algorithms for Active Learning in Multi-Armed Bandits

68. Best-Arm Identification in Linear Bandits

69. Truthful Learning Mechanisms for Multi-Slot Sponsored Search Auctions with Externalities

70. Online Stochastic Optimization under Correlated Bandit Feedback

71. Sequential Transfer in Multi-armed Bandit with Finite Set of Models

72. Regret Bounds for Reinforcement Learning with Policy Advice

73. Risk-Aversion in Multi-armed Bandits

74. A Dantzig Selector Approach to Temporal Difference Learning

75. Transfer from Multiple MDPs

79. Regularized Least Squares Temporal Difference Learning with Nested ℓ2 and ℓ1 Penalization

80. Upper-Confidence-Bound Algorithms for Active Learning in Multi-armed Bandits

85. Incremental Skill Acquisition for Self-motivated Learning Animats

89. Sample complexity bounds for stochastic shortest path with a generative model

90. Sketched Newton-Raphson

91. No-regret exploration in goal-oriented reinforcement learning

92. Reward-free exploration beyond finite-horizon

97. Rotting bandits are not harder than stochastic ones

98. Fighting Boredom in Recommender Systems with Linear Reinforcement Learning

99. Improved Regret Bounds for Thompson Sampling in Linear Quadratic Control Problems

100. Improved large-scale graph learning through ridge spectral sparsification
