ceres实现的pnp解算后的位姿优化代码详解

写在前面

这篇文章作为基础文章也是本文的学习和理解的过程，在将会给出更多的注释和“废话”帮助自己理解。同时有错误的话欢迎各位朋友留言指教。

Ceres solver 是google开发的一款用于非线性优化的库，常用在SLAM中BA问题的求解，在谷歌的开源激光雷达slam项目cartographer中被大量使用。Ceres官网上的文档非常详细地介绍了其具体使用方法，相比于另外一个在slam中被广泛使用的图优化库G2O，ceres具有更丰富的API文档和官方教程！

ceres优化案例

对于任何一个优化问题，首先需要对问题进行建模，之后采用合适的优化方法，进行求解，在求解的过程中，往往需要进行梯度下降求取最优，这里涉及了导数计算，所以在代码中使用Ceres进行优化时，需要包含基本内容：建模、优化、求导三个部分。

1、建模：构建cost fuction，即代价函数，也就是寻优的目标式。这个部分需要使用仿函数（functor）这一技巧来实现，做法是定义一个cost function的结构体，在结构体内重载（）运算符，具体实现方法后续介绍。

2、优化：通过代价函数构建待求解的优化问题。

3、求导：配置求解器参数并求解问题，这个步骤就是设置方程怎么求解、求解过程是否输出等，最后调用一下Solve方法。

首先学习一下Ceres官网教程给出的例程中，求解的问题是求x使得1/2(10−x)^2取到最小值，详细解释如下：

#include<iostream>#include<ceres/ceres.h>

using namespace std;using namespace ceres;

//构建代价函数，重载（）符号struct CostFunctor { template <typename T> bool operator()(const T* const x, T* residual) const { residual[0] = T(10.0) - x[0]; return true; }};//主函数int main(int argc, char** argv) { // ceres需要设置寻优参数的初始值，这里将x设置为5 double initial_x = 5.0; double x = initial_x;

// 固定格式，构建寻优问题 Problem problem; CostFunction* cost_function = new AutoDiffCostFunction<CostFunctor, 1, 1>(new CostFunctor);

//使用自动求导，将之前的代价函数结构体传入，第一个1是输出维度，即残差的维度，第二个1是输入维度，即待寻优参数x的维度。分别对应上述结构体中的residual和x

 problem.AddResidualBlock(cost_function, NULL, &x); //向问题中添加误差项，该问题比较简单，添加一个就行。

//最后配置并运行求解器 Solver::Options options; options.linear_solver_type = ceres::DENSE_QR; //配置增量方程的解法 options.minimizer_progress_to_stdout = true;//输出到cout Solver::Summary summary;//优化信息 Solve(options, &problem, &summary);//求解!!!

std::cout << summary.BriefReport() << "\n";//输出优化的简要信息 std::cout << "x : " << initial_x << " -> " << x << "\n"; return 0;}

在这个例子中，使用的是自动求导法（AutoDiffCostFunction）， Ceres提供了三种求导方法，分别是：解析求导、数值求导与自动求导

使用ceres的关键在于构建代价函数，这里我们再学习一下官网给出的以视觉SLAM的重投影误差作为CostFunction的代码（http://ceres-solver.org/nnls_tutorial.html#bundle-adjustment）

给定一系列测得的图像，包含特征点位置和对应关系。BA的目标就是，通过最小化重投影误差，确定三维空间点的位置和相机参数。这个优化问题通常被描述为非线性最小二乘法问题，要最小化的目标函数即为观测到的特征点位置与对应的三维点在相机成像平面上投影之差的L2平方模

struct SnavelyReprojectionError { SnavelyReprojectionError(double observed_x, double observed_y) : observed_x(observed_x), observed_y(observed_y) {}

template <typename T> bool operator()(const T* const camera, const T* const point, T* residuals) const { //首先根据相机的外参将世界坐标系转换到相机坐标系下并归一化 // camera[0,1,2] are the angle-axis rotation. T p[3]; // 将世界坐标系的3D点point，转到相机坐标系下的3D点P ceres::AngleAxisRotatePoint(camera, point, p); // camera[3,4,5] are the translation. p[0] += camera[3]; p[1] += camera[4]; p[2] += camera[5];

// Compute the center of distortion. The sign change comes from // the camera model that Noah Snavely's Bundler assumes, whereby // the camera coordinate system has a negative z axis. // 计算归一化坐标 T xp = - p[0] / p[2]; T yp = - p[1] / p[2]; //其次根据相机的内参对归一化的相机坐标系进行畸变的计算和投影点计算。 // Apply second and fourth order radial distortion. // 径向畸变系数 const T& l1 = camera[7]; const T& l2 = camera[8]; T r2 = xp*xp + yp*yp; T distortion = 1.0 + r2 * (l1 + l2 * r2); // Compute final projected point position. const T& focal = camera[6]; //焦距 // 像素坐标 T predicted_x = focal * distortion * xp; T predicted_y = focal * distortion * yp; // The error is the difference between the predicted and observed position. //最后将计算出来的3D投影点坐标与观察的图像图像坐标 // 重投影误差 residuals[0] = predicted_x - T(observed_x); residuals[1] = predicted_y - T(observed_y); return true; } // Factory to hide the construction of the CostFunction object from // the client code. static ceres::CostFunction* Create(const double observed_x, const double observed_y) { return (new ceres::AutoDiffCostFunction<SnavelyReprojectionError, 2, 9, 3>( new SnavelyReprojectionError(observed_x, observed_y))); } double observed_x; double observed_y;};

上面就是有点啰嗦也是对官方给出的代码的学习过程，接下来就是我想分享的内容，比如我们在实际的项目中经常会遇到以下这几种计算：

（1）计算两帧图像的RT:两帧图像特征点进行匹配，通过Pnp计算出旋转和平移后，对其计算的结果进行优化。这在ORB_SLAM是常见。

（2）计算相机位姿，比如我知道了图像坐标系中多个特征点，并通过一些测量的手段知道了对应的特征点在世界坐标系下的坐标值，此时也是需要Pnp后进行优化处理。

所以这里给出一个常用的代码如下：

cv::Mat rvec, t, R; cv::solvePnP(match.pts_3d, match.pts_2d, K, cv::Mat(), rvec, t, false, cv::SOLVEPNP_EPNP); //Pnp 解算输出两帧之间的位姿或者是世界坐标系在相机坐标下的位姿 cv::Rodrigues(rvec, R);//对旋转向量进行罗德里格式变换生成旋转矩阵 //接下来就是ceres的优化函数 PoseOptimization(match.pts_3d, match.pts_2d, rvec, t);

首先我们看一下定义的重投影误差。代码格式和上面定义的优化相机内参的代码有类似的地方，因为大多的我们这种不是造轮子的人都是照葫芦画瓢就OK了，

比如我们在使用自动求导模板参数时，其参数顺序为: 误差类型，输出维度，输入维度，维度要与前面的struct一致。

所以我们可以看出优化内参和优化外参的区别，求解内参其定义：

new ceres::AutoDiffCostFunction<SnavelyReprojectionError, 2, 9, 3>(

new SnavelyReprojectionError(observed_x, observed_y))

2,9,3 的区别：2是输出维度（残差），9和3是输入维度（相机的9个参数和三维点坐标3个参数）

下面的这个代码函数是求解R t,最后的不同在于：

new ceres::AutoDiffCostFunction<ReprojectionError, 2, 3, 3>(

new ReprojectionError(points, observed))

2,3,3 的区别：2是输出维度（残差），3和3是输入维度，分别为旋转和平移向量

class ReprojectionError{public: ReprojectionError(Eigen::Vector3d point_, Eigen::Vector2d observed_) : point(point_), observed(observed_) { } template<typename T> bool operator()(const T* const camera_r, const T* const camera_t, T* residuals) const{ T pt1[3]; //3D点坐标系 pt1[0] = T(point.x()); pt1[1] = T(point.y()); pt1[2] = T(point.z());

T pt2[3];//对3D点进行坐标转换，换算到相机坐标系下 ceres::AngleAxisRotatePoint(camera_r, pt1, pt2);

pt2[0] = pt2[0] + camera_t[0]; pt2[1] = pt2[1] + camera_t[1]; pt2[2] = pt2[2] + camera_t[2]; //归一化并将其转换到图像坐标系下 const T xp = T(K[0] * (pt2[0] / pt2[2]) + K[2]); const T yp = T(K[1] * (pt2[1] / pt2[2]) + K[3]); const T u = T(observed.x()); const T v = T(observed.y()); //投影点与图像点的误差 residuals[0] = u - xp; residuals[1] = v - yp; return true; } static ceres::CostFunction* Create(Eigen::Vector3d points, Eigen::Vector2d observed){ return (new ceres::AutoDiffCostFunction<ReprojectionError, 2, 3, 3>( new ReprojectionError(points, observed))); }

private: Eigen::Vector3d point; Eigen::Vector2d observed; // Camera intrinsics double K[4] = {520.9, 521.0, 325.1, 249.7}; // fx,fy,cx,cy};

最后我们在程序中使用PoseOptimization进行优化

void PoseOptimization(const std::vector<cv::Point3d>& points_3d, const std::vector<cv::Point2d>& points_2d, cv::Mat &rvec, cv::Mat &t){ // Attention: cv::Mat::type assert(rvec.type() == CV_64F); assert(t.type() == CV_64F);

double camera_rvec[3]; camera_rvec[0] = rvec.at<double>(0,0); camera_rvec[1] = rvec.at<double>(1,0); camera_rvec[2] = rvec.at<double>(2,0); double camera_t[3]; camera_t[0] = t.at<double>(0,0); camera_t[1] = t.at<double>(1,0); camera_t[2] = t.at<double>(2,0);

ceres::Problem problem; ceres::LossFunction* lossfunction = NULL; for(uint i = 0; i < points_3d.size(); i++)//将3D点和对应的图像点加入到代码函数中 { Eigen::Vector3d p3d(points_3d[i].x, points_3d[i].y, points_3d[i].z); Eigen::Vector2d p2d(points_2d[i].x, points_2d[i].y); ceres::CostFunction* costfunction = ReprojectionError::Create(p3d, p2d);

 problem.AddResidualBlock(costfunction, lossfunction, camera_rvec, camera_t);//这一步也是我们照葫芦画瓢的关键，输入正确的参数

}

//设置优化器参数及优化方法 ceres::Solver::Options options; options.linear_solver_type = ceres::DENSE_SCHUR; options.max_num_iterations = 100; options.trust_region_strategy_type = ceres::LEVENBERG_MARQUARDT; options.minimizer_progress_to_stdout = true;

//调用求解器进行优化 ceres::Solver::Summary summary; ceres::Solve(options, &problem, &summary); std::cout << summary.BriefReport() << std::endl;

std::cout << "After Optimizing: " << std::endl; double quat[4]; ceres::AngleAxisToQuaternion(camera_rvec, quat); Eigen::Quaterniond q(quat[0], quat[1], quat[2], quat[3]); //对优化后的rt进行赋值还原 rvec.at<double>(0,0) = camera_rvec[0]; rvec.at<double>(1,0) = camera_rvec[1]; rvec.at<double>(2,0) = camera_rvec[2]; t.at<double>(0,0) = camera_t[0]; t.at<double>(1,0) = camera_t[1]; t.at<double>(2,0) = camera_t[2];}

总结

这里我们就对以上两种情况进行对比分析发现，优化内参和优化外参的在主函数调用时候的区别。

优化内参的函数：

SnavelyReprojection::Create(observations[2 * i + 0], observations[2 * i + 1])；problem.AddResidualBlock(cost_function, loss_function, camera, point);

Create中为输出的函数已知量，这里就为图像坐标系上的角点。

AddResidualBlock 后两个参数为待优化的参数，分别为相机内参合3D点。

优化外参的函数：

ceres::CostFunction* costfunction = ReprojectionError::Create(p3d, p2d);problem.AddResidualBlock(costfunction, lossfunction, camera_rvec, camera_t);

Create中为输出的函数已知量，这里为空间三维点和对应的图像坐标系上的点。

AddResidualBlock 后两个参数为待优化的参数，分别为相机对应的旋转和平移量。

资源

三维点云论文及相关应用分享

【点云论文速读】基于激光雷达的里程计及3D点云地图中的定位方法

3D目标检测：MV3D-Net

三维点云分割综述（上）

3D-MiniNet: 从点云中学习2D表示以实现快速有效的3D LIDAR语义分割（2020）

win下使用QT添加VTK插件实现点云可视化GUI

JSNet：3D点云的联合实例和语义分割

大场景三维点云的语义分割综述

PCL中outofcore模块---基于核外八叉树的大规模点云的显示