ReLeaPS: Reinforcement Learning-based Illumination Planning for Generalized Photometric Stereo
ICCV 2023

  • 1National Key Laboratory for Multimedia Information Processing, School of Computer Science, Peking University
  • 2National Engineering Research Center of Visual Technology, School of Computer Science, Peking University
  • 3School of Artificial Intelligence, Beijing University of Posts and Telecommunications
  • 4Osaka University
  • 5School of Mechanical Engineering, Shanghai Jiao Tong University


Illumination planning in photometric stereo aims to find a balance between surface normal estimation accuracy and image capturing efficiency by selecting optimal light configurations. It depends on factors such as the unknown shape and general reflectance of the target object, global illumination, and the choice of photometric stereo backbones, which are too complex to be handled by existing methods based on handcrafted illumination planning rules. This paper proposes a learning-based illumination planning method that jointly considers these factors by integrating a neural network and a generalized image formation model. As it is impractical to supervise illumination planning due to the enormous search space for ground truth light configurations, we formulate illumination planning using reinforcement learning, which explores the light space in a photometric stereo-aware and reward-driven manner. Experiments on synthetic and real-world datasets demonstrate that photometric stereo under the 20-light configurations from our method is comparable to, or even surpasses, that of using lights from all available directions.


  • Proposing the first RL approach for online illumination planning in a reward-driven manner;
  • Designing a dueling DQN specially tailored to generalized photometric stereo;
  • Enhancing the performance of different photometric stereo backbones with a smaller number of inputs by appropriate illumination planning; and
  • Evaluating RL-based illumination planning by building a real data validation setup.
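The dueling DQN mentioned in the second bullet splits the Q-value into a state value V(s) and per-action advantages A(s, a), and subtracting the mean advantage keeps the two streams identifiable. A minimal pure-Python sketch of that aggregation, where each action is one candidate light direction (the function name and inputs are illustrative, not the authors' code):

```python
def dueling_q_values(state_value, advantages):
    """Combine V(s) with per-action advantages A(s, a) into Q-values:
    Q(s, a) = V(s) + A(s, a) - mean_a A(s, a).
    Each action corresponds to one candidate light direction on the dome."""
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + a - mean_adv for a in advantages]

# The agent switches on the light with the highest Q-value next.
q = dueling_q_values(0.5, [0.2, -0.1, 0.4, 0.0])
best_light = max(range(len(q)), key=q.__getitem__)
```

In this decomposition the Q-values average back to the state value, so the advantage stream only encodes the relative merit of each candidate light.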


The illumination planning pipeline of ReLeaPS:

  • The agent observes the state and selects an action based on the Q-values. (Red block)
  • The environment captures a new image according to the action, forming a new state. (Blue block)
  • The angular error between the predicted and ground truth normals forms the reward, which is used to update the network. This cycle repeats until the episode terminates. (Green block)
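The cycle above can be sketched as a greedy selection loop. The following is a hypothetical stand-in, assuming an epsilon-greedy agent, a stubbed `q_function`, and a stubbed `capture_image` environment; it is not the paper's implementation, and the DQN update driven by the reward is omitted:

```python
import random

def plan_illumination(q_function, capture_image, num_lights, budget, epsilon=0.1):
    """Select `budget` light directions out of `num_lights` candidates.
    q_function(state, remaining) returns one Q-value per remaining light;
    capture_image(light) simulates the environment capturing a new image."""
    state, chosen = [], []
    remaining = list(range(num_lights))
    while len(chosen) < budget:
        q_values = q_function(state, remaining)
        if random.random() < epsilon:            # explore a random light
            idx = random.randrange(len(remaining))
        else:                                    # exploit the best Q-value
            idx = max(range(len(remaining)), key=lambda i: q_values[i])
        light = remaining.pop(idx)
        state.append(capture_image(light))       # new image extends the state
        chosen.append(light)
    return chosen
```

In the actual method, the reward (derived from the angular error of the backbone's normal estimate) would update the Q-network between episodes.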


Results-Quantitative Comparison

Quantitative comparisons of different illumination planning approaches in terms of mean angular error on the Blobby, Sculpture, DiLiGenT, and DiLiGenT10^2 datasets using different photometric stereo backbones with 20 lights. 'Rnd.' stands for random selection of light directions, averaged over 10 evaluations.
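Both evaluations report the mean angular error between estimated and ground truth normals, i.e., the average of arccos of the dot product between corresponding unit normals. A small pure-Python sketch of that metric (illustrative, assumes unit-length input normals):

```python
import math

def mean_angular_error_deg(pred_normals, gt_normals):
    """Mean angular error (degrees) between predicted and ground truth
    unit normals, given as parallel sequences of 3-vectors."""
    total = 0.0
    for p, g in zip(pred_normals, gt_normals):
        dot = sum(a * b for a, b in zip(p, g))
        dot = max(-1.0, min(1.0, dot))           # clamp for acos stability
        total += math.degrees(math.acos(dot))
    return total / len(pred_normals)
```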

Results-Quantitative Evaluation

Quantitative evaluation of illumination planning methods w.r.t. the number of light directions on real-world benchmarks: DiLiGenT (left) and DiLiGenT10^2 (right). The mean angular error is averaged over the LS, CNN-PS, and PS-FCN backbones.

Results-Qualitative Comparison

Qualitative comparison of recovered surface normals and error maps for (left) Reading (from DiLiGenT) and (right) LionHead (captured using our setup) using different illumination planning methods (i.e., DC05, TK22, and Ours) with an increasing number of light directions (3, 7, 11, 15, and 20 lights) and the CNN-PS backbone. The red circle indicates a shadowed region that CNN-PS cannot effectively recover, resulting in a large angular error.


@InProceedings{Chan_2023_ICCV,
    author = {Chan, Junhoong and Yu, Bohan and Guo, Heng and Ren, Jieji and Lu, Zongqing and Shi, Boxin},
    title = {{ReLeaPS}: Reinforcement Learning-based Illumination Planning for Generalized Photometric Stereo},
    booktitle = {Proceedings of the International Conference on Computer Vision (ICCV)},
    month = {October},
    year = {2023},
}


For any questions or further discussion, please send an e-mail to junhoong95_AT_stu_DOT_pku_DOT_edu_DOT_cn.


This work is supported by the National Natural Science Foundation of China under Grant Nos. 62136001 and 62088102. Heng Guo was supported by JSPS KAKENHI (Grant No. JP23H05491).