Most existing simultaneous localization and mapping (SLAM) methods rely on the assumption of a static environment. Moving objects in the scene introduce considerable uncertainty into the SLAM results and also hinder loop-closure detection (LCD). Although moving object tracking (MOT) is necessary for planning and decision making, it is usually performed separately. To jointly solve SLAM and MOT in complex urban driving scenarios, this paper presents a high-performance method, 4D-SLAM, based on the fusion of LiDAR and IMU measurements. The integration of SLAM and MOT is formulated as a joint posterior probability problem over a dynamic Bayesian network (DBN) and is implemented in four sequential stages: preprocessing, moving object detection and tracking, odometry estimation, and mapping. In the preprocessing stage, the motion distortion caused by LiDAR scanning is compensated and an initial estimate of the LiDAR motion is obtained. In the moving object detection and tracking stage, a CNN-based segmentation network first detects potential moving objects, whose states are then estimated with an unscented Kalman filter (UKF). In the odometry estimation stage, distinctive planar and edge features extracted from the static background point cloud are used for odometry estimation, and a two-step Levenberg-Marquardt optimization solves the 6-DOF pose between consecutive scans. In the mapping stage, a map is built from the estimated poses and LCD is performed, with graph-based global optimization further improving map consistency in large-scale environments. Comprehensive experiments on the open-source KITTI dataset and on our self-collected data show that the proposed method not only outperforms state-of-the-art SLAM methods in trajectory and mapping accuracy but also detects and tracks moving objects efficiently.
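As a concrete illustration of the preprocessing stage, the sketch below deskews a LiDAR scan by interpolating the sensor pose over the sweep. The constant-velocity assumption, the per-point normalized timestamps, the function name, and the use of SciPy are our own illustrative choices under stated assumptions, not the paper's implementation.

```python
import numpy as np
from scipy.spatial.transform import Rotation, Slerp

def deskew_scan(points, timestamps, T_start, T_end):
    """Re-project each LiDAR point into the frame at the scan end.

    points     : (N, 3) raw points in the sensor frame.
    timestamps : (N,) per-point acquisition times, normalized to [0, 1].
    T_start    : (R, t) sensor pose at scan start (3x3 rotation, 3-vector).
    T_end      : (R, t) sensor pose at scan end.
    """
    R0, t0 = T_start
    R1, t1 = T_end
    # Interpolate rotation with spherical linear interpolation (slerp)
    key_rots = Rotation.from_matrix(np.stack([R0, R1]))
    slerp = Slerp([0.0, 1.0], key_rots)
    R_i = slerp(timestamps)                                   # per-point rotations
    t_i = (1 - timestamps)[:, None] * t0 + timestamps[:, None] * t1

    # World coordinates of each point under its interpolated pose
    p_world = R_i.apply(points) + t_i
    # Re-express all points in the scan-end frame to remove distortion
    return (p_world - t1) @ R1
```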
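For the moving object tracking stage, a UKF over a detected object can be sketched with a constant-velocity model. The state layout, scan rate, and noise values below are illustrative assumptions (the abstract does not specify them), and the example uses the filterpy library rather than the paper's own tracker.

```python
import numpy as np
from filterpy.kalman import UnscentedKalmanFilter, MerweScaledSigmaPoints

dt = 0.1  # LiDAR scan period in seconds (assumed 10 Hz)

def fx(x, dt):
    # Constant-velocity motion model: state = [px, py, vx, vy]
    F = np.array([[1, 0, dt, 0],
                  [0, 1, 0, dt],
                  [0, 0, 1, 0],
                  [0, 0, 0, 1]], dtype=float)
    return F @ x

def hx(x):
    # Measurement: object centroid position from the segmentation stage
    return x[:2]

points = MerweScaledSigmaPoints(n=4, alpha=1e-3, beta=2.0, kappa=0.0)
ukf = UnscentedKalmanFilter(dim_x=4, dim_z=2, dt=dt, fx=fx, hx=hx, points=points)
ukf.x = np.array([0.0, 0.0, 0.0, 0.0])   # initial state
ukf.R = np.eye(2) * 0.05                 # measurement noise (assumed)
ukf.Q = np.eye(4) * 0.01                 # process noise (assumed)

# One predict/update cycle per scan, fed with detected centroids
for z in [np.array([0.9, 0.1]), np.array([1.8, 0.2])]:
    ukf.predict()
    ukf.update(z)
    print("tracked state:", ukf.x)
```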
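The two-step Levenberg-Marquardt optimization can be illustrated with SciPy's LM solver: planar features first constrain [t_z, roll, pitch], then edge features constrain [t_x, t_y, yaw]. This decomposition follows the common LeGO-LOAM-style split and is our assumption about the stage; the synthetic correspondences below are placeholders for the extracted features.

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation

def transform(points, pose):
    # Apply a 6-DOF pose [tx, ty, tz, roll, pitch, yaw] to Nx3 points
    R = Rotation.from_euler('xyz', pose[3:]).as_matrix()
    return points @ R.T + pose[:3]

# --- Synthetic correspondences (placeholders for extracted features) ---
rng = np.random.default_rng(0)
true_pose = np.array([0.5, -0.3, 0.1, 0.02, -0.01, 0.05])
plane_pts = rng.uniform(-10, 10, (50, 3))       # planar feature points
plane_n = np.tile([0.0, 0.0, 1.0], (50, 1))     # ground-plane normals
plane_d = -np.einsum('ij,ij->i', transform(plane_pts, true_pose), plane_n)
edge_pts = rng.uniform(-10, 10, (50, 3))        # edge feature points
edge_dir = np.tile([0.0, 0.0, 1.0], (50, 1))    # vertical edge directions
edge_anchor = transform(edge_pts, true_pose)    # points on the target lines

def planar_res(sub, pose):
    # Step 1 residuals: point-to-plane distances; sub = [tz, roll, pitch]
    full = pose.copy(); full[[2, 3, 4]] = sub
    return np.einsum('ij,ij->i', transform(plane_pts, full), plane_n) + plane_d

def edge_res(sub, pose):
    # Step 2 residuals: point-to-line cross products; sub = [tx, ty, yaw]
    full = pose.copy(); full[[0, 1, 5]] = sub
    diff = transform(edge_pts, full) - edge_anchor
    return np.cross(diff, edge_dir).ravel()

pose = np.zeros(6)
pose[[2, 3, 4]] = least_squares(planar_res, pose[[2, 3, 4]],
                                method='lm', args=(pose,)).x
pose[[0, 1, 5]] = least_squares(edge_res, pose[[0, 1, 5]],
                                method='lm', args=(pose,)).x
print("estimated pose:", pose)   # converges toward true_pose
```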
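Finally, the graph-based global optimization of the mapping stage can be sketched as a small pose graph with odometry and loop-closure factors. The example uses GTSAM's Python bindings, and the square trajectory, noise sigmas, and integer keys are illustrative assumptions, not the paper's backend.

```python
import numpy as np
import gtsam

# Odometry between consecutive scans: move 2 m forward, then turn 90 deg
odom = gtsam.Pose3(gtsam.Rot3.Yaw(np.pi / 2), gtsam.Point3(2, 0, 0))
noise = gtsam.noiseModel.Diagonal.Sigmas(
    np.array([0.05, 0.05, 0.05, 0.1, 0.1, 0.1]))  # [rot; trans] sigmas (assumed)

graph = gtsam.NonlinearFactorGraph()
graph.add(gtsam.PriorFactorPose3(0, gtsam.Pose3(), noise))  # anchor the map
for i in range(4):
    graph.add(gtsam.BetweenFactorPose3(i, i + 1, odom, noise))
# Loop closure detected between scan 4 and scan 0 (relative pose = identity)
graph.add(gtsam.BetweenFactorPose3(4, 0, gtsam.Pose3(), noise))

# Initial guesses chained from odometry alone
initial = gtsam.Values()
pose = gtsam.Pose3()
for i in range(5):
    initial.insert(i, pose)
    pose = pose.compose(odom)

result = gtsam.LevenbergMarquardtOptimizer(graph, initial).optimize()
print(result.atPose3(4))  # close to the origin once the loop is enforced
```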