Skip to content

Commit 9d646e6

Browse files
yuvaltassacopybara-github
authored andcommitted
Extend rangefinder sensor to support cameras.
PiperOrigin-RevId: 848255889 Change-Id: I6e1a9ed13d29d2242558625a47e24e47ad6e87db
1 parent 4a87199 commit 9d646e6

File tree

17 files changed

+384
-40
lines changed

17 files changed

+384
-40
lines changed

doc/XMLreference.rst

Lines changed: 20 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -6373,12 +6373,20 @@ site frame. The output is a 3D vector.
63736373
:el-prefix:`sensor/` |-| **rangefinder** (*)
63746374
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
63756375

6376-
This element creates a rangefinder. It measures the distance to the nearest geom surface, along the ray defined by the
6377-
positive Z-axis of the sensor site. If the ray does not intersect any geom surface, the sensor output is -1. If the
6378-
origin of the ray is inside a geom, the surface is still sensed (but not the inner volume). Geoms attached to the same
6379-
body as the sensor site are excluded. Invisible geoms, defined as geoms whose rgba (or whose material rgba) has alpha=0,
6380-
are also excluded. Note however that geoms made invisible in the visualizer by disabling their geom group are not
6381-
excluded; this is because sensor calculations are independent of the visualizer.
6376+
This element creates a rangefinder.
6377+
6378+
- If associated with a :ref:`site<sensor-rangefinder-site>`, it measures the distance to the nearest geom surface, along
6379+
the ray defined by the positive Z-axis of the site.
6380+
- If associated with a :ref:`camera<sensor-rangefinder-camera>`, it outputs one distance measurement for each pixel in
6381+
the camera image. Note that cameras face the :ref:`negative Z-axis<body-camera>` of their frame. The number of
6382+
measurements in this case is equal to product of the camera's width and height
6383+
:ref:`resolutions<body-camera-resolution>`.
6384+
6385+
If a ray does not intersect any geom surface, the sensor output is -1. If the origin of the ray is inside a geom, the
6386+
surface is still detected. Geoms attached to the same body as the sensor site/camera are excluded. Invisible geoms,
6387+
defined as geoms whose rgba (or whose material rgba) has alpha=0, are also excluded. Note however that geoms made
6388+
invisible in the visualizer by disabling their geom group are not excluded; this is because sensor calculations are
6389+
independent of the visualizer.
63826390

63836391
.. _sensor-rangefinder-name:
63846392

@@ -6393,9 +6401,14 @@ excluded; this is because sensor calculations are independent of the visualizer.
63936401

63946402
.. _sensor-rangefinder-site:
63956403

6396-
:at:`site`: :at-val:`string, required`
6404+
:at:`site`: :at-val:`string, optional`
63976405
The site where the sensor is attached.
63986406

6407+
.. _sensor-rangefinder-camera:
6408+
6409+
:at:`camera`: :at-val:`string, optional`
6410+
The camera where the sensor is attached.
6411+
63996412
.. _sensor-camprojection:
64006413

64016414
:el-prefix:`sensor/` |-| **camprojection** (*)

doc/XMLschema.rst

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -940,9 +940,9 @@
940940
| :ref:`rangefinder | \* | :class: mjcf-attributes |
941941
| <sensor-rangefinder>` | | |
942942
| | | +-----------------------------------------------------------------+-----------------------------------------------------------------+-----------------------------------------------------------------+-----------------------------------------------------------------+ |
943-
| | | | :ref:`name<sensor-rangefinder-name>` | :ref:`site<sensor-rangefinder-site>` | :ref:`cutoff<sensor-rangefinder-cutoff>` | :ref:`noise<sensor-rangefinder-noise>` | |
943+
| | | | :ref:`name<sensor-rangefinder-name>` | :ref:`site<sensor-rangefinder-site>` | :ref:`camera<sensor-rangefinder-camera>` | :ref:`cutoff<sensor-rangefinder-cutoff>` | |
944944
| | | +-----------------------------------------------------------------+-----------------------------------------------------------------+-----------------------------------------------------------------+-----------------------------------------------------------------+ |
945-
| | | | :ref:`user<sensor-rangefinder-user>` | | | | |
945+
| | | | :ref:`noise<sensor-rangefinder-noise>` | :ref:`user<sensor-rangefinder-user>` | | | |
946946
| | | +-----------------------------------------------------------------+-----------------------------------------------------------------+-----------------------------------------------------------------+-----------------------------------------------------------------+ |
947947
+------------------------------------+----+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
948948
| |_| sensor |br| |_| |L| | | .. table:: |

doc/changelog.rst

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -20,6 +20,8 @@ General
2020
^^^^^^^
2121
- Camera frustum visualization is now triggered by setting :ref:`resolution<body-camera-resolution>` to values larger
2222
than 1. Relatedly, frustum visualization also works for :ref:`orthographic<body-camera-projection>` cameras.
23+
- Rangefinder sensors can now be attached to a camera using the :ref:`ragefinder/camera<sensor-rangefinder-camera>`
24+
attribute. In this case, the sensor will cast multiple ray, one for each camera pixel.
2325
- Non-breaking ABI changes:
2426

2527
- The type of the ``sig`` (signature) argument of :ref:`mj_stateSize` and related functions has been changed from

doc/includes/references.h

Lines changed: 1 addition & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -678,7 +678,6 @@ typedef enum mjtObj_ { // type of MujoCo object
678678
mjOBJ_FRAME = 100, // frame
679679
mjOBJ_DEFAULT, // default
680680
mjOBJ_MODEL // entire model
681-
682681
} mjtObj;
683682
typedef enum mjtSensor_ { // type of sensor
684683
// common robotic sensors, attached to a site
@@ -689,7 +688,7 @@ typedef enum mjtSensor_ { // type of sensor
689688
mjSENS_FORCE, // 3D force between site's body and its parent body
690689
mjSENS_TORQUE, // 3D torque between site's body and its parent body
691690
mjSENS_MAGNETOMETER, // 3D magnetometer
692-
mjSENS_RANGEFINDER, // scalar distance to nearest geom or site along z-axis
691+
mjSENS_RANGEFINDER, // scalar distance to nearest geom along z-axis
693692
mjSENS_CAMPROJECTION, // pixel coordinates of a site in the camera image
694693

695694
// sensors related to scalar joints, tendons, actuators

include/mujoco/mjmodel.h

Lines changed: 1 addition & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -296,7 +296,6 @@ typedef enum mjtObj_ { // type of MujoCo object
296296
mjOBJ_FRAME = 100, // frame
297297
mjOBJ_DEFAULT, // default
298298
mjOBJ_MODEL // entire model
299-
300299
} mjtObj;
301300

302301

@@ -309,7 +308,7 @@ typedef enum mjtSensor_ { // type of sensor
309308
mjSENS_FORCE, // 3D force between site's body and its parent body
310309
mjSENS_TORQUE, // 3D torque between site's body and its parent body
311310
mjSENS_MAGNETOMETER, // 3D magnetometer
312-
mjSENS_RANGEFINDER, // scalar distance to nearest geom or site along z-axis
311+
mjSENS_RANGEFINDER, // scalar distance to nearest geom along z-axis
313312
mjSENS_CAMPROJECTION, // pixel coordinates of a site in the camera image
314313

315314
// sensors related to scalar joints, tendons, actuators

src/engine/engine_sensor.c

Lines changed: 56 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -435,11 +435,62 @@ void mj_sensorPos(const mjModel* m, mjData* d) {
435435
break;
436436

437437
case mjSENS_RANGEFINDER: // rangefinder
438-
rvec[0] = d->site_xmat[9*objid+2];
439-
rvec[1] = d->site_xmat[9*objid+5];
440-
rvec[2] = d->site_xmat[9*objid+8];
441-
d->sensordata[adr] = mj_ray(m, d, d->site_xpos+3*objid, rvec, NULL, 1,
442-
m->site_bodyid[objid], NULL);
438+
if (objtype == mjOBJ_SITE) {
439+
rvec[0] = d->site_xmat[9*objid+2];
440+
rvec[1] = d->site_xmat[9*objid+5];
441+
rvec[2] = d->site_xmat[9*objid+8];
442+
d->sensordata[adr] = mj_ray(m, d, d->site_xpos+3*objid, rvec, NULL, 1,
443+
m->site_bodyid[objid], NULL);
444+
} else {
445+
// camera-attached rangefinder: depth image
446+
const int width = m->cam_resolution[2*objid];
447+
const int height = m->cam_resolution[2*objid+1];
448+
const int bodyexclude = m->cam_bodyid[objid];
449+
const mjtNum* cam_xpos = d->cam_xpos + 3*objid;
450+
const mjtNum* cam_xmat = d->cam_xmat + 9*objid;
451+
const int projection = m->cam_projection[objid];
452+
453+
// compute focal length in pixels using helper
454+
mjtNum fx, fy, cx, cy, ortho_extent;
455+
mju_camIntrinsics(m, objid, &fx, &fy, &cx, &cy, &ortho_extent);
456+
457+
if (projection == mjPROJ_PERSPECTIVE) {
458+
// perspective: all rays share origin, different directions
459+
const int npixel = width * height;
460+
mj_markStack(d);
461+
mjtNum* vec = mjSTACKALLOC(d, 3*npixel, mjtNum);
462+
int* geomid = mjSTACKALLOC(d, npixel, int);
463+
464+
// compute ray directions using helper (normalized)
465+
for (int row = 0; row < height; row++) {
466+
for (int col = 0; col < width; col++) {
467+
int idx = row*width + col;
468+
mjtNum origin[3];
469+
mju_camPixelRay(origin, vec + 3*idx, cam_xpos, cam_xmat,
470+
col, row, fx, fy, cx, cy, projection, ortho_extent);
471+
}
472+
}
473+
474+
// cast all rays
475+
mj_multiRay(m, d, cam_xpos, vec, NULL, 1, bodyexclude,
476+
geomid, d->sensordata + adr, npixel, mjMAXVAL);
477+
478+
mj_freeStack(d);
479+
} else {
480+
// orthographic: parallel rays, different origins
481+
for (int row = 0; row < height; row++) {
482+
for (int col = 0; col < width; col++) {
483+
int idx = row*width + col;
484+
mjtNum origin[3], direction[3];
485+
mju_camPixelRay(origin, direction, cam_xpos, cam_xmat,
486+
col, row, fx, fy, cx, cy, projection, ortho_extent);
487+
488+
d->sensordata[adr + idx] = mj_ray(m, d, origin, direction, NULL, 1,
489+
bodyexclude, NULL);
490+
}
491+
}
492+
}
493+
}
443494

444495
break;
445496

src/engine/engine_support.c

Lines changed: 40 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -743,3 +743,43 @@ int mju_condataSize(int dataspec) {
743743
}
744744
return size;
745745
}
746+
747+
748+
// compute camera pixel parameters from model, output are:
749+
// pixel units: fx, fy (focal lengths), cx, cy (principal point)
750+
// length units: extent
751+
void mju_camIntrinsics(const mjModel* m, int camid,
752+
mjtNum* fx, mjtNum* fy, mjtNum* cx, mjtNum* cy, mjtNum* extent) {
753+
const int width = m->cam_resolution[2*camid];
754+
const int height = m->cam_resolution[2*camid+1];
755+
const float* sensorsize = m->cam_sensorsize + 2*camid;
756+
const float* intrinsic = m->cam_intrinsic + 4*camid;
757+
const mjtProjection projection = (mjtProjection)m->cam_projection[camid];
758+
759+
switch (projection) {
760+
case mjPROJ_PERSPECTIVE:
761+
if (sensorsize[0] && sensorsize[1]) {
762+
// intrinsic-based perspective camera
763+
*fx = intrinsic[0] / sensorsize[0] * width;
764+
*fy = intrinsic[1] / sensorsize[1] * height;
765+
*cx = intrinsic[2] / sensorsize[0] * width;
766+
*cy = intrinsic[3] / sensorsize[1] * height;
767+
} else {
768+
// fovy-based perspective camera
769+
*fx = *fy = 0.5 / mju_tan(m->cam_fovy[camid] * mjPI / 360.0) * height;
770+
*cx = (mjtNum)width / 2.0;
771+
*cy = (mjtNum)height / 2.0;
772+
}
773+
break;
774+
case mjPROJ_ORTHOGRAPHIC:
775+
// orthographic: normalize pixel offset to [-1, 1]
776+
*fx = (mjtNum)width / 2.0;
777+
*fy = (mjtNum)height / 2.0;
778+
*cx = *fx;
779+
*cy = *fy;
780+
break;
781+
}
782+
783+
// extent only used for orthographic cameras
784+
*extent = m->cam_fovy[camid];
785+
}

src/engine/engine_support.h

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -120,6 +120,12 @@ MJAPI const char* mj_versionString(void);
120120
// return total size of data fields in a contact sensor bitfield specification
121121
MJAPI int mju_condataSize(int dataSpec);
122122

123+
// compute camera pixel parameters from model
124+
// outputs: fx, fy (focal length in pixels), cx, cy (principal point), ortho_extent
125+
void mju_camIntrinsics(const mjModel* m, int camid,
126+
mjtNum* fx, mjtNum* fy, mjtNum* cx, mjtNum* cy,
127+
mjtNum* ortho_extent);
128+
123129
#ifdef __cplusplus
124130
}
125131
#endif

src/engine/engine_util_misc.c

Lines changed: 38 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -417,6 +417,8 @@ mjtNum mju_wrap(mjtNum wpnt[6], const mjtNum x0[3], const mjtNum x1[3],
417417
}
418418

419419

420+
//------------------------------ misc geometry -----------------------------------------------------
421+
420422
// all 3 semi-axes of a geom
421423
void mju_geomSemiAxes(mjtNum semiaxes[3], const mjtNum size[3], mjtGeom type) {
422424
switch (type) {
@@ -494,7 +496,42 @@ int mju_insideGeom(const mjtNum pos[3], const mjtNum mat[9], const mjtNum size[3
494496
}
495497

496498

497-
// ----------------------------- Flex interpolation ------------------------------------------------
499+
// compute ray origin and direction for pixel (col, row) in camera image
500+
// for perspective: origin is unchanged, direction is computed
501+
// for orthographic: direction is -Z in camera frame, origin is offset from camera center
502+
void mju_camPixelRay(mjtNum origin[3], mjtNum direction[3],
503+
const mjtNum cam_xpos[3], const mjtNum cam_xmat[9],
504+
int col, int row, mjtNum fx, mjtNum fy, mjtNum cx, mjtNum cy,
505+
int projection, mjtNum ortho_extent) {
506+
// pixel center (row 0 = top of image)
507+
mjtNum px = col + 0.5 - cx;
508+
mjtNum py = row + 0.5 - cy;
509+
510+
if (projection == mjPROJ_PERSPECTIVE) {
511+
// origin is camera position
512+
mju_copy3(origin, cam_xpos);
513+
514+
// direction in camera frame: (x/fx, -y/fy, -1), then normalized
515+
mjtNum dir_cam[3] = {px / fx, -py / fy, -1.0};
516+
mju_mulMatVec3(direction, cam_xmat, dir_cam);
517+
mju_normalize3(direction);
518+
} else {
519+
// orthographic: parallel rays, direction is -Z in camera frame
520+
direction[0] = -cam_xmat[2];
521+
direction[1] = -cam_xmat[5];
522+
direction[2] = -cam_xmat[8];
523+
524+
// origin offset in camera frame (ortho_extent is full height, use half for each side)
525+
mjtNum half_extent = ortho_extent / 2;
526+
mjtNum offset_cam[3] = {px / fx * half_extent, -py / fy * half_extent, 0};
527+
mjtNum offset_world[3];
528+
mju_mulMatVec3(offset_world, cam_xmat, offset_cam);
529+
mju_add3(origin, cam_xpos, offset_world);
530+
}
531+
}
532+
533+
534+
// ----------------------------- flex interpolation ------------------------------------------------
498535

499536
mjtNum static inline phi(mjtNum s, int i, int order) {
500537
if (order == 1) {

src/engine/engine_util_misc.h

Lines changed: 7 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -57,6 +57,13 @@ MJAPI void mju_geomSemiAxes(mjtNum semiaxes[3], const mjtNum size[3], mjtGeom ty
5757
int mju_insideGeom(const mjtNum pos[3], const mjtNum mat[9], const mjtNum size[3], mjtGeom type,
5858
const mjtNum point[3]);
5959

60+
// compute ray origin and direction for pixel (col, row) in camera image
61+
// directions are normalized so ray functions return actual 3D distance
62+
void mju_camPixelRay(mjtNum origin[3], mjtNum direction[3],
63+
const mjtNum cam_xpos[3], const mjtNum cam_xmat[9],
64+
int col, int row, mjtNum fx, mjtNum fy, mjtNum cx, mjtNum cy,
65+
int projection, mjtNum ortho_extent);
66+
6067
// ----------------------------- Flex interpolation ------------------------------------------------
6168

6269
// evaluate the deformation gradient at p using the nodal dof values

0 commit comments

Comments
 (0)