The comment by Vatankhah on “Three-dimensional potential field data inversion with L0 quasi-norm sparse constraints” mainly offers advice on my paper. Vatankhah focuses on the L0 norm as applied to potential field data inversion. Last and Kubik (1983) first introduced the compact inversion of gravity data, and later inversion studies were based on this principle (e.g., Portniaguine and Zhdanov 1999). The strategy used to obtain compact models is similar to the L0 norm constraint, but the two are not the same. In my paper, I used the L0 norm, which was first introduced in compressive sensing theory. Compressive sensing is the reconstruction of sparse images or signals from very few samples by solving a tractable optimization problem (Baraniuk 2008; Chartrand 2009). It was a major breakthrough, and the principle has since been applied in many domains (e.g., Mohimani, Babaie-Zadeh and Jutten 2009; Zhang and Tian 2017). In these studies, an approximate zero norm replaces the L0 norm.
Vatankhah’s comment may contain some inaccurate understanding. The function $f^{2}_{\sigma}(m)$ (equation (3) in Vatankhah’s comment, corresponding to the function used in Meng (2018)) is better suited to potential field data inversion than $f^{1}_{\sigma}(m)$ (equation (1) in Vatankhah’s comment, introduced by Last and Kubik 1983). In Vatankhah’s Figure 1(a), for large values of σ and large absolute values of m, $f^{2}_{\sigma}(m)$ takes smaller values than $f^{1}_{\sigma}(m)$, and therefore $f^{2}_{\sigma}(m)$ is smoother than $f^{1}_{\sigma}(m)$. For large σ, $f^{1}_{\sigma}(m)$ places more weight on the large elements of the parameter vector m than $f^{2}_{\sigma}(m)$ does, and so imposes a larger penalty on these elements. Thus, $f^{2}_{\sigma}(m)$ can provide smoother solutions. This behavior shows that noise has less effect on $f^{2}_{\sigma}(m)$ than on $f^{1}_{\sigma}(m)$ when both are used to approximate the L0 norm. Therefore, $f^{2}_{\sigma}(m)$ performs better for large values of σ.
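As a minimal numerical sketch of the comparison above, the Python snippet below tabulates two commonly used L0 approximations for a large and a small σ. The explicit formulas are assumptions, since they are not written out in this reply: `f1` is taken as the rational form m²/(m² + σ²) associated with Last and Kubik (1983), and `f2` as the Gaussian form 1 − exp(−m²/(2σ²)) associated with the smoothed L0 norm of Mohimani, Babaie-Zadeh and Jutten (2009). Equations (1) and (3) in Vatankhah’s comment may differ from these in scaling. The names `f1`, `f2` and `compare` are illustrative only.

```python
import math


def f1(m, sigma):
    """Assumed form of equation (1): rational approximation m^2 / (m^2 + sigma^2)."""
    return m * m / (m * m + sigma * sigma)


def f2(m, sigma):
    """Assumed form of equation (3): Gaussian approximation 1 - exp(-m^2 / (2 sigma^2))."""
    return 1.0 - math.exp(-m * m / (2.0 * sigma * sigma))


def compare(sigma, m_values):
    """Return a list of (m, f1, f2) tuples for one value of sigma."""
    return [(m, f1(m, sigma), f2(m, sigma)) for m in m_values]


if __name__ == "__main__":
    # Sample a few absolute values of m for a large and a small sigma.
    for sigma in (1.0, 0.1):
        print(f"sigma = {sigma}")
        for m, a, b in compare(sigma, [0.05, 0.1, 0.2, 0.4, 0.6, 1.0]):
            print(f"  |m| = {m:4.2f}  f1 = {a:.4f}  f2 = {b:.4f}")
```

Under these assumed forms, both functions depend on m only through the ratio |m|/σ, so the point at which the two curves cross scales linearly with σ. The exact crossing read from Vatankhah’s Figure 1 may differ if the published equations use a different scaling.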
Even when σ = 0.1 (Vatankhah’s Figure 1(b)), $f^{2}_{\sigma}(m)$ has smaller values than $f^{1}_{\sigma}(m)$ for absolute values of m less than 0.2, and larger values than $f^{1}_{\sigma}(m)$ for absolute values of m between 0.2 and 0.6. Therefore, equation (3) can give sparser and better inversion results. Vatankhah’s Figure 1 shows this behavior clearly. In potential field data inversion, the depth-weighting function and the density constraint function are very important and commonly used. I think that researchers can focus on the solution of the L0 norm problem. For example, my paper applies the Newton gradient method, which effectively avoids calculating a step size. I think that this reply can help readers gain a better understanding of this topic.