Camera Motion#

When the camera itself moves, the apparent motion of tracked objects becomes a mix of "the object moved" and "the camera moved". Norfair's camera motion module estimates the camera's frame-to-frame transformation from optical flow and lets the Tracker reason about positions in a stable world coordinate system.

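The main entry point is MotionEstimator. For each frame it returns a CoordinatesTransformation, which you pass to Tracker.update(coord_transformations=...). Two concrete transformation families are provided:

  • TranslationTransformation (estimated by TranslationTransformationGetter), which models the camera motion as a 2D shift.
  • HomographyTransformation (estimated by HomographyTransformationGetter), which models it as a 3×3 homography and is the MotionEstimator default.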

Example#

from norfair import Detection, Tracker, Video, draw_points
from norfair.camera_motion import MotionEstimator

tracker = Tracker(distance_function="euclidean", distance_threshold=50)
motion_estimator = MotionEstimator()

with Video(input_path="video.mp4") as video:
    for frame in video:
        coord_transformation = motion_estimator.update(frame)

        detections = [Detection(points=p) for p in my_detector(frame)]
        tracked_objects = tracker.update(
            detections=detections,
            coord_transformations=coord_transformation,
        )

        draw_points(frame, tracked_objects)
        video.write(frame)

You can pass a mask to motion_estimator.update() to ignore image regions that contain the tracked objects themselves (which would otherwise pollute the optical-flow estimate). A common pattern is to build the mask from the previous frame's detections or tracked objects.
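The masking pattern can be sketched with NumPy alone. The helper name `build_motion_mask`, the `(x_min, y_min, x_max, y_max)` box format, and the `margin` parameter are hypothetical. The sketch also assumes the OpenCV mask convention, where non-zero pixels may be sampled and zero pixels are ignored.

```python
import numpy as np


def _clamp(value, low, high):
    """Clamp a coordinate to the closed interval [low, high]."""
    return max(low, min(int(value), high))


def build_motion_mask(frame_shape, boxes, margin=0):
    """Return a uint8 mask that is 0 inside every box and 1 elsewhere.

    frame_shape: (height, width, ...) of the frame.
    boxes: iterable of (x_min, y_min, x_max, y_max) pixel boxes
        (hypothetical format; adapt it to your detector's output).
    margin: extra pixels to exclude around each box.
    """
    height, width = frame_shape[:2]
    mask = np.ones((height, width), dtype=np.uint8)
    for x_min, y_min, x_max, y_max in boxes:
        # Clamp so boxes that spill over the border never wrap around.
        x0 = _clamp(x_min - margin, 0, width)
        x1 = _clamp(x_max + margin, 0, width)
        y0 = _clamp(y_min - margin, 0, height)
        y1 = _clamp(y_max + margin, 0, height)
        mask[y0:y1, x0:x1] = 0
    return mask


# One box from the previous frame, padded by 5 px on every side.
mask = build_motion_mask((480, 640, 3), [(100, 50, 200, 150)], margin=5)
```

Inside the loop from the example, the mask would then be passed along the lines of `motion_estimator.update(frame, mask=mask)`, assuming the keyword is named `mask`.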

For debugging, set MotionEstimator(draw_flow=True) to overlay the sampled optical-flow vectors on the frame, and use draw_absolute_grid together with FixedCamera to visualize the recovered world coordinate system.

API#

Camera motion estimation module.

Contains the abstract coordinate transformation interfaces, the built-in translation and homography implementations, and the MotionEstimator class that ties them to OpenCV's sparse optical flow.

CoordinatesTransformation #

Bases: ABC

Abstract base class representing a coordinate transformation.

Detection and tracked-object coordinates can be interpreted in two reference frames:

  • Relative — their position on the current frame, where (0, 0) is the top-left corner.
  • Absolute — their position in a fixed space, where (0, 0) is the top-left corner of the first frame of the video.

A CoordinatesTransformation can map coordinates from one reference to the other.

Source code in norfair/camera_motion.py
class CoordinatesTransformation(ABC):
    """Abstract base class representing a coordinate transformation.

    Detection and tracked-object coordinates can be interpreted in two
    reference frames:

    - **Relative** — their position on the current frame, where ``(0, 0)``
      is the top-left corner.
    - **Absolute** — their position in a fixed space, where ``(0, 0)`` is
      the top-left corner of the first frame of the video.

    A ``CoordinatesTransformation`` can map coordinates from one reference
    to the other.
    """

    @abstractmethod
    def abs_to_rel(self, points: NDArray[np.float64]) -> NDArray[np.float64]:
        """Map absolute-frame points to the current relative frame.

        Parameters
        ----------
        points : np.ndarray
            Array of shape ``(n_points, dim_points)`` in absolute
            coordinates.

        Returns
        -------
        np.ndarray
            Transformed points with the same shape, in relative
            coordinates.
        """

    @abstractmethod
    def rel_to_abs(self, points: NDArray[np.float64]) -> NDArray[np.float64]:
        """Map relative-frame points to the absolute frame.

        Parameters
        ----------
        points : np.ndarray
            Array of shape ``(n_points, dim_points)`` in relative
            coordinates.

        Returns
        -------
        np.ndarray
            Transformed points with the same shape, in absolute
            coordinates.
        """

abs_to_rel(points) abstractmethod #

Map absolute-frame points to the current relative frame.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| points | ndarray | Array of shape (n_points, dim_points) in absolute coordinates. | required |

Returns:

| Type | Description |
| --- | --- |
| ndarray | Transformed points with the same shape, in relative coordinates. |

Source code in norfair/camera_motion.py
@abstractmethod
def abs_to_rel(self, points: NDArray[np.float64]) -> NDArray[np.float64]:
    """Map absolute-frame points to the current relative frame.

    Parameters
    ----------
    points : np.ndarray
        Array of shape ``(n_points, dim_points)`` in absolute
        coordinates.

    Returns
    -------
    np.ndarray
        Transformed points with the same shape, in relative
        coordinates.
    """

rel_to_abs(points) abstractmethod #

Map relative-frame points to the absolute frame.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| points | ndarray | Array of shape (n_points, dim_points) in relative coordinates. | required |

Returns:

| Type | Description |
| --- | --- |
| ndarray | Transformed points with the same shape, in absolute coordinates. |

Source code in norfair/camera_motion.py
@abstractmethod
def rel_to_abs(self, points: NDArray[np.float64]) -> NDArray[np.float64]:
    """Map relative-frame points to the absolute frame.

    Parameters
    ----------
    points : np.ndarray
        Array of shape ``(n_points, dim_points)`` in relative
        coordinates.

    Returns
    -------
    np.ndarray
        Transformed points with the same shape, in absolute
        coordinates.
    """

TransformationGetter #

Bases: ABC

Abstract base class for objects that infer a CoordinatesTransformation.

Subclasses take two point clouds (current-frame and previous-frame features, in that order) and return a pair: a flag indicating whether the reference frame should be reset, and the inferred transformation.
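To illustrate the contract, here is a minimal sketch of a custom getter. `MedianTranslationGetter` and the `Translation` stand-in are hypothetical names, not part of Norfair. In real code the getter would presumably subclass TransformationGetter and return a TranslationTransformation. The sketch assumes that a getter which requests a reference reset must itself accumulate the motion seen so far, as the built-in getters on this page do.

```python
import numpy as np


class Translation:
    """Stand-in exposing the same two methods as a CoordinatesTransformation."""

    def __init__(self, movement_vector):
        self.movement_vector = movement_vector

    def abs_to_rel(self, points):
        return points + self.movement_vector

    def rel_to_abs(self, points):
        return points - self.movement_vector


class MedianTranslationGetter:
    """Hypothetical getter: estimates the camera shift as the median flow."""

    def __init__(self):
        # Total shift since the first frame.
        self.accumulated = np.zeros(2)

    def __call__(self, curr_pts, prev_pts):
        step = np.median(curr_pts - prev_pts, axis=0)
        self.accumulated = self.accumulated + step
        # True asks the caller to make the current frame the new reference,
        # so the next call only sees the motion since this frame.
        return True, Translation(self.accumulated.copy())


getter = MedianTranslationGetter()
prev_pts = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
update_reference, transformation = getter(prev_pts + [3.0, -1.0], prev_pts)
update_reference, transformation = getter(prev_pts + [2.0, 0.0], prev_pts)
```

After the two calls the returned transformation carries the combined shift of both steps, not just the last one.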

Source code in norfair/camera_motion.py
class TransformationGetter(ABC):
    """Abstract base class for objects that infer a ``CoordinatesTransformation``.

    Subclasses take two point clouds (previous and current frame
    features) and return a flag indicating whether the reference frame
    should be reset together with the inferred transformation.
    """

    @abstractmethod
    def __call__(
        self, curr_pts: NDArray[np.float64], prev_pts: NDArray[np.float64]
    ) -> tuple[bool, CoordinatesTransformation | None]:
        """Return ``(update_reference, transformation)`` for the current pair."""

__call__(curr_pts, prev_pts) abstractmethod #

Return (update_reference, transformation) for the current pair.

Source code in norfair/camera_motion.py
@abstractmethod
def __call__(
    self, curr_pts: NDArray[np.float64], prev_pts: NDArray[np.float64]
) -> tuple[bool, CoordinatesTransformation | None]:
    """Return ``(update_reference, transformation)`` for the current pair."""

TranslationTransformation #

Bases: CoordinatesTransformation

Coordinate transformation using a simple 2D translation.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| movement_vector | ndarray | The vector representing the translation. | required |

Source code in norfair/camera_motion.py
class TranslationTransformation(CoordinatesTransformation):
    """Coordinate transformation using a simple 2D translation.

    Parameters
    ----------
    movement_vector : np.ndarray
        The vector representing the translation.

    """

    def __init__(self, movement_vector):
        """Store the ``movement_vector`` used for the translation."""
        self.movement_vector = movement_vector

    def abs_to_rel(self, points: NDArray[np.float64]) -> NDArray[np.float64]:
        """Translate absolute points into the current relative frame.

        Parameters
        ----------
        points : NDArray[np.float64]
            Array of shape ``(n_points, dim_points)`` in absolute
            coordinates.

        Returns
        -------
        NDArray[np.float64]
            Translated points with the same shape, in relative
            coordinates.
        """
        return points + self.movement_vector

    def rel_to_abs(self, points: NDArray[np.float64]) -> NDArray[np.float64]:
        """Translate relative points back into the absolute frame.

        Parameters
        ----------
        points : NDArray[np.float64]
            Array of shape ``(n_points, dim_points)`` in relative
            coordinates.

        Returns
        -------
        NDArray[np.float64]
            Translated points with the same shape, in absolute
            coordinates.
        """
        return points - self.movement_vector

__init__(movement_vector) #

Store the movement_vector used for the translation.

Source code in norfair/camera_motion.py
def __init__(self, movement_vector):
    """Store the ``movement_vector`` used for the translation."""
    self.movement_vector = movement_vector

abs_to_rel(points) #

Translate absolute points into the current relative frame.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| points | NDArray[float64] | Array of shape (n_points, dim_points) in absolute coordinates. | required |

Returns:

| Type | Description |
| --- | --- |
| NDArray[float64] | Translated points with the same shape, in relative coordinates. |

Source code in norfair/camera_motion.py
def abs_to_rel(self, points: NDArray[np.float64]) -> NDArray[np.float64]:
    """Translate absolute points into the current relative frame.

    Parameters
    ----------
    points : NDArray[np.float64]
        Array of shape ``(n_points, dim_points)`` in absolute
        coordinates.

    Returns
    -------
    NDArray[np.float64]
        Translated points with the same shape, in relative
        coordinates.
    """
    return points + self.movement_vector

rel_to_abs(points) #

Translate relative points back into the absolute frame.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| points | NDArray[float64] | Array of shape (n_points, dim_points) in relative coordinates. | required |

Returns:

| Type | Description |
| --- | --- |
| NDArray[float64] | Translated points with the same shape, in absolute coordinates. |

Source code in norfair/camera_motion.py
def rel_to_abs(self, points: NDArray[np.float64]) -> NDArray[np.float64]:
    """Translate relative points back into the absolute frame.

    Parameters
    ----------
    points : NDArray[np.float64]
        Array of shape ``(n_points, dim_points)`` in relative
        coordinates.

    Returns
    -------
    NDArray[np.float64]
        Translated points with the same shape, in absolute
        coordinates.
    """
    return points - self.movement_vector
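A small worked example may help with the sign convention. To stay runnable without Norfair, the sketch below mirrors the two methods shown above in a local class named `TranslationSketch`, which is a hypothetical stand-in. The scenario is assumed: static background features appear 10 px further left and 4 px lower than they did in the first frame.

```python
import numpy as np


class TranslationSketch:
    """Local mirror of the abs_to_rel / rel_to_abs pair shown above."""

    def __init__(self, movement_vector):
        self.movement_vector = movement_vector

    def abs_to_rel(self, points):
        return points + self.movement_vector

    def rel_to_abs(self, points):
        return points - self.movement_vector


# Flow of the static background: current position minus first-frame position.
movement_vector = np.array([-10.0, 4.0])
transformation = TranslationSketch(movement_vector)

absolute = np.array([[100.0, 50.0], [0.0, 0.0]])
relative = transformation.abs_to_rel(absolute)  # where the points are drawn now
restored = transformation.rel_to_abs(relative)  # back to first-frame coordinates
```

Here the absolute point (100, 50) lands at (90, 54) on the current frame, and `rel_to_abs` recovers the original coordinates.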

TranslationTransformationGetter #

Bases: TransformationGetter

Compute a TranslationTransformation from a pair of point clouds.

The camera movement is estimated as the mode of the optical flow between the previous reference frame and the current one.

Comparing consecutive frames can yield differences too small to estimate the translation reliably, so the reference frame is kept fixed as we progress through the video. Once the transformation can no longer match enough points, the reference frame is reset.
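The binning-and-mode step can be reproduced with NumPy on a toy point set. The point coordinates below are made up for illustration. The parameter values are the documented defaults.

```python
import numpy as np

bin_size = 0.2
threshold = 0.9  # proportion_points_used_threshold

prev_pts = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]])
# Three background points share almost the same flow; the fourth point
# sits on an independently moving object.
true_flow = np.array([[2.03, 0.01], [1.98, 0.02], [2.05, 0.04], [7.0, 3.0]])
curr_pts = prev_pts + true_flow

# Bucketize the flow vectors, then take the most frequent bucket.
flow = curr_pts - prev_pts
binned = np.around(flow / bin_size) * bin_size
unique_flows, counts = np.unique(binned, axis=0, return_counts=True)
max_index = counts.argmax()

flow_mode = unique_flows[max_index]
proportion_points_used = counts[max_index] / len(prev_pts)
update_reference = proportion_points_used < threshold
```

Three of the four points fall into the bucket around (2.0, 0.0), so the outlier does not affect the estimate. Because only 75% of the points agree, which is below the 0.9 threshold, the reference frame would be reset.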

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| bin_size | float | Before calculating the mode, optical-flow vectors are bucketized into bins of this size. | 0.2 |
| proportion_points_used_threshold | float | Proportion of points that must be matched; if the ratio drops below this value, the reference frame is updated. | 0.9 |

Source code in norfair/camera_motion.py
class TranslationTransformationGetter(TransformationGetter):
    """Compute a :class:`TranslationTransformation` from a pair of point clouds.

    The camera movement is estimated as the mode of the optical flow
    between the previous reference frame and the current one.

    Comparing consecutive frames can yield differences too small to
    estimate the translation reliably, so the reference frame is kept
    fixed as we progress through the video. Once the transformation can
    no longer match enough points, the reference frame is reset.

    Parameters
    ----------
    bin_size : float
        Before calculating the mode, optical-flow vectors are bucketized
        into bins of this size.
    proportion_points_used_threshold : float
        Proportion of points that must be matched; if the ratio drops
        below this value, the reference frame is updated.

    """

    def __init__(
        self, bin_size: float = 0.2, proportion_points_used_threshold: float = 0.9
    ) -> None:
        """Store parameters and initialize the running flow accumulator."""
        self.bin_size = bin_size
        self.proportion_points_used_threshold = proportion_points_used_threshold
        self.data = None

    def __call__(
        self, curr_pts: NDArray[np.float64], prev_pts: NDArray[np.float64]
    ) -> tuple[bool, TranslationTransformation]:
        """Return the translation that best matches the optical flow."""
        # get flow
        flow = curr_pts - prev_pts

        # get mode
        flow = np.around(flow / self.bin_size) * self.bin_size
        unique_flows, counts = np.unique(flow, axis=0, return_counts=True)

        max_index = counts.argmax()

        proportion_points_used = counts[max_index] / len(prev_pts)
        update_prvs = proportion_points_used < self.proportion_points_used_threshold

        # ``unique_flows[max_index]`` is a view; copy to decouple the
        # accumulator from ``unique_flows``.
        flow_mode = unique_flows[max_index].copy()

        # ``self.data`` holds the accumulated translation since the first
        # frame; ``None`` on the very first call.
        if self.data is not None:
            flow_mode = flow_mode + self.data

        if update_prvs:
            # Separate copy so ``movement_vector`` and ``self.data`` stay
            # independent.
            self.data = flow_mode.copy()

        return update_prvs, TranslationTransformation(flow_mode)

__init__(bin_size=0.2, proportion_points_used_threshold=0.9) #

Store parameters and initialize the running flow accumulator.

Source code in norfair/camera_motion.py
def __init__(
    self, bin_size: float = 0.2, proportion_points_used_threshold: float = 0.9
) -> None:
    """Store parameters and initialize the running flow accumulator."""
    self.bin_size = bin_size
    self.proportion_points_used_threshold = proportion_points_used_threshold
    self.data = None

__call__(curr_pts, prev_pts) #

Return the translation that best matches the optical flow.

Source code in norfair/camera_motion.py
def __call__(
    self, curr_pts: NDArray[np.float64], prev_pts: NDArray[np.float64]
) -> tuple[bool, TranslationTransformation]:
    """Return the translation that best matches the optical flow."""
    # get flow
    flow = curr_pts - prev_pts

    # get mode
    flow = np.around(flow / self.bin_size) * self.bin_size
    unique_flows, counts = np.unique(flow, axis=0, return_counts=True)

    max_index = counts.argmax()

    proportion_points_used = counts[max_index] / len(prev_pts)
    update_prvs = proportion_points_used < self.proportion_points_used_threshold

    # ``unique_flows[max_index]`` is a view; copy to decouple the
    # accumulator from ``unique_flows``.
    flow_mode = unique_flows[max_index].copy()

    # ``self.data`` holds the accumulated translation since the first
    # frame; ``None`` on the very first call.
    if self.data is not None:
        flow_mode = flow_mode + self.data

    if update_prvs:
        # Separate copy so ``movement_vector`` and ``self.data`` stay
        # independent.
        self.data = flow_mode.copy()

    return update_prvs, TranslationTransformation(flow_mode)

HomographyTransformation #

Bases: CoordinatesTransformation

Coordinate transformation using a 3×3 homography matrix.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| homography_matrix | ndarray | The matrix representing the homography. | required |

Source code in norfair/camera_motion.py
class HomographyTransformation(CoordinatesTransformation):
    """Coordinate transformation using a 3×3 homography matrix.

    Parameters
    ----------
    homography_matrix : np.ndarray
        The matrix representing the homography.

    """

    def __init__(self, homography_matrix: np.ndarray):
        """Store the homography and pre-compute its inverse.

        Raises
        ------
        ValueError
            If ``homography_matrix`` is singular (non-invertible) and no
            mapping back from relative to absolute coordinates is possible.
        """
        self.homography_matrix = homography_matrix
        try:
            self.inverse_homography_matrix = np.linalg.inv(homography_matrix)
        except np.linalg.LinAlgError as exc:
            raise ValueError(
                "homography_matrix is singular (non-invertible); cannot compute "
                "rel_to_abs transform. Check your homography estimation "
                "(e.g. degenerate point configuration)."
            ) from exc

    def abs_to_rel(self, points: NDArray[np.float64]) -> NDArray[np.float64]:
        """Apply the forward homography to map absolute points to relative.

        Parameters
        ----------
        points : NDArray[np.float64]
            Array of shape ``(n_points, dim_points)`` or ``(dim_points,)``
            in absolute coordinates.

        Returns
        -------
        NDArray[np.float64]
            Transformed points with the same shape, in relative
            coordinates.
        """
        single_point = points.ndim == 1
        if single_point:
            points = points.reshape(1, -1)
        ones = np.ones((len(points), 1))
        points_with_ones = np.hstack((points, ones))
        points_transformed = points_with_ones @ self.homography_matrix.T
        last_column = points_transformed[:, -1]
        last_column = np.where(last_column == 0, np.finfo(float).eps, last_column)
        points_transformed = points_transformed / last_column.reshape(-1, 1)
        result = points_transformed[:, :2]
        if single_point:
            return result.flatten()
        return result

    def rel_to_abs(self, points: NDArray[np.float64]) -> NDArray[np.float64]:
        """Apply the inverse homography to map relative points to absolute.

        Parameters
        ----------
        points : NDArray[np.float64]
            Array of shape ``(n_points, dim_points)`` or ``(dim_points,)``
            in relative coordinates.

        Returns
        -------
        NDArray[np.float64]
            Transformed points with the same shape, in absolute
            coordinates.
        """
        single_point = points.ndim == 1
        if single_point:
            points = points.reshape(1, -1)
        ones = np.ones((len(points), 1))
        points_with_ones = np.hstack((points, ones))
        points_transformed = points_with_ones @ self.inverse_homography_matrix.T
        last_column = points_transformed[:, -1]
        last_column = np.where(last_column == 0, np.finfo(float).eps, last_column)
        points_transformed = points_transformed / last_column.reshape(-1, 1)
        result = points_transformed[:, :2]
        if single_point:
            return result.flatten()
        return result

__init__(homography_matrix) #

Store the homography and pre-compute its inverse.

Raises:

| Type | Description |
| --- | --- |
| ValueError | If homography_matrix is singular (non-invertible) and no mapping back from relative to absolute coordinates is possible. |

Source code in norfair/camera_motion.py
def __init__(self, homography_matrix: np.ndarray):
    """Store the homography and pre-compute its inverse.

    Raises
    ------
    ValueError
        If ``homography_matrix`` is singular (non-invertible) and no
        mapping back from relative to absolute coordinates is possible.
    """
    self.homography_matrix = homography_matrix
    try:
        self.inverse_homography_matrix = np.linalg.inv(homography_matrix)
    except np.linalg.LinAlgError as exc:
        raise ValueError(
            "homography_matrix is singular (non-invertible); cannot compute "
            "rel_to_abs transform. Check your homography estimation "
            "(e.g. degenerate point configuration)."
        ) from exc

abs_to_rel(points) #

Apply the forward homography to map absolute points to relative.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| points | NDArray[float64] | Array of shape (n_points, dim_points) or (dim_points,) in absolute coordinates. | required |

Returns:

| Type | Description |
| --- | --- |
| NDArray[float64] | Transformed points with the same shape, in relative coordinates. |

Source code in norfair/camera_motion.py
def abs_to_rel(self, points: NDArray[np.float64]) -> NDArray[np.float64]:
    """Apply the forward homography to map absolute points to relative.

    Parameters
    ----------
    points : NDArray[np.float64]
        Array of shape ``(n_points, dim_points)`` or ``(dim_points,)``
        in absolute coordinates.

    Returns
    -------
    NDArray[np.float64]
        Transformed points with the same shape, in relative
        coordinates.
    """
    single_point = points.ndim == 1
    if single_point:
        points = points.reshape(1, -1)
    ones = np.ones((len(points), 1))
    points_with_ones = np.hstack((points, ones))
    points_transformed = points_with_ones @ self.homography_matrix.T
    last_column = points_transformed[:, -1]
    last_column = np.where(last_column == 0, np.finfo(float).eps, last_column)
    points_transformed = points_transformed / last_column.reshape(-1, 1)
    result = points_transformed[:, :2]
    if single_point:
        return result.flatten()
    return result

rel_to_abs(points) #

Apply the inverse homography to map relative points to absolute.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| points | NDArray[float64] | Array of shape (n_points, dim_points) or (dim_points,) in relative coordinates. | required |

Returns:

| Type | Description |
| --- | --- |
| NDArray[float64] | Transformed points with the same shape, in absolute coordinates. |

Source code in norfair/camera_motion.py
def rel_to_abs(self, points: NDArray[np.float64]) -> NDArray[np.float64]:
    """Apply the inverse homography to map relative points to absolute.

    Parameters
    ----------
    points : NDArray[np.float64]
        Array of shape ``(n_points, dim_points)`` or ``(dim_points,)``
        in relative coordinates.

    Returns
    -------
    NDArray[np.float64]
        Transformed points with the same shape, in absolute
        coordinates.
    """
    single_point = points.ndim == 1
    if single_point:
        points = points.reshape(1, -1)
    ones = np.ones((len(points), 1))
    points_with_ones = np.hstack((points, ones))
    points_transformed = points_with_ones @ self.inverse_homography_matrix.T
    last_column = points_transformed[:, -1]
    last_column = np.where(last_column == 0, np.finfo(float).eps, last_column)
    points_transformed = points_transformed / last_column.reshape(-1, 1)
    result = points_transformed[:, :2]
    if single_point:
        return result.flatten()
    return result
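The homogeneous-coordinate arithmetic used by both methods can be checked on a small case. The helper name `apply_homography` and the matrix values are illustrative assumptions. The sketch omits the zero-divisor guard and the single-point handling that the methods above include.

```python
import numpy as np


def apply_homography(matrix, points):
    """Apply a 3x3 homography to an (n_points, 2) array of 2D points."""
    ones = np.ones((len(points), 1))
    homogeneous = np.hstack((points, ones))  # (x, y) -> (x, y, 1)
    transformed = homogeneous @ matrix.T     # row-vector form of H @ p
    # Divide by the third (w) coordinate to return to 2D.
    return transformed[:, :2] / transformed[:, 2:3]


# A shift of (5, -3) plus a mild perspective term in the last row.
H = np.array(
    [
        [1.0, 0.0, 5.0],
        [0.0, 1.0, -3.0],
        [0.001, 0.0, 1.0],
    ]
)

absolute = np.array([[100.0, 50.0], [0.0, 0.0]])
relative = apply_homography(H, absolute)                 # like abs_to_rel
restored = apply_homography(np.linalg.inv(H), relative)  # like rel_to_abs
```

For the point (100, 50) the homogeneous result is (105, 47, 1.1), so the 2D point is (105 / 1.1, 47 / 1.1). Applying the inverse matrix returns the original coordinates up to floating-point error.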

HomographyTransformationGetter #

Bases: TransformationGetter

Compute a HomographyTransformation from a pair of point clouds.

The camera movement is represented as a homography fitted to the optical flow between the previous reference frame and the current one.

Comparing consecutive frames can yield differences too small to estimate the homography reliably, so the estimate often collapses to the identity. The reference frame is therefore kept fixed as we progress through the video; once the transformation can no longer match enough points, the reference frame is reset.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| method | int | One of OpenCV's methods for finding homographies. Valid options are [0, cv2.RANSAC, cv2.LMEDS, cv2.RHO]. Defaults to cv2.RANSAC. | None |
| ransac_reproj_threshold | int | Maximum allowed reprojection error to treat a point pair as an inlier. See the OpenCV docs linked below for details. | 3 |
| max_iters | int | The maximum number of RANSAC iterations. See the OpenCV docs linked below for details. | 2000 |
| confidence | float | Confidence level, must be between 0 and 1. See the OpenCV docs linked below for details. | 0.995 |
| proportion_points_used_threshold | float | Proportion of points that must be matched; if the ratio drops below this value, the reference frame is updated. | 0.9 |

See Also:

cv2.findHomography

Source code in norfair/camera_motion.py
class HomographyTransformationGetter(TransformationGetter):
    """Compute a :class:`HomographyTransformation` from a pair of point clouds.

    The camera movement is represented as a homography that maps the
    optical flow between the previous reference frame and the current one.

    Comparing consecutive frames can make differences too small to
    estimate the homography reliably, often collapsing to the identity.
    The reference frame is therefore kept fixed as we progress through
    the video; once the transformation can no longer match enough points,
    it is reset.

    Parameters
    ----------
    method : int, optional
        One of OpenCV's methods for finding homographies. Valid options
        are ``[0, cv2.RANSAC, cv2.LMEDS, cv2.RHO]``. Defaults to
        ``cv2.RANSAC``.
    ransac_reproj_threshold : int, optional
        Maximum allowed reprojection error to treat a point pair as an
        inlier. See the OpenCV docs linked below for details.
    max_iters : int, optional
        The maximum number of RANSAC iterations. See the OpenCV docs
        linked below for details.
    confidence : float, optional
        Confidence level, must be between 0 and 1. See the OpenCV docs
        linked below for details.
    proportion_points_used_threshold : float, optional
        Proportion of points that must be matched; if the ratio drops
        below this value, the reference frame is updated.

    See Also
    --------
    [`cv2.findHomography`](https://docs.opencv.org/3.4/d9/d0c/group__calib3d.html#ga4abc2ece9fab9398f2e560d53c8c9780)

    """

    def __init__(
        self,
        method: int | None = None,
        ransac_reproj_threshold: int = 3,
        max_iters: int = 2000,
        confidence: float = 0.995,
        proportion_points_used_threshold: float = 0.9,
    ) -> None:
        """Store RANSAC parameters and initialize the running homography."""
        self.data = None
        if method is None:
            method = cv2.RANSAC
        self.method = method
        self.ransac_reproj_threshold = ransac_reproj_threshold
        self.max_iters = max_iters
        self.confidence = confidence
        self.proportion_points_used_threshold = proportion_points_used_threshold

    def __call__(
        self, curr_pts: NDArray[np.float64], prev_pts: NDArray[np.float64]
    ) -> tuple[bool, HomographyTransformation | None]:
        """Return the homography that best matches the optical flow."""
        if not (
            isinstance(prev_pts, np.ndarray)
            and prev_pts.shape[0] >= 4
            and isinstance(curr_pts, np.ndarray)
            and curr_pts.shape[0] >= 4
        ):
            logger.warning(
                "The homography couldn't be computed in this frame "
                "due to low amount of points"
            )
            if isinstance(self.data, np.ndarray):
                return True, HomographyTransformation(self.data)
            else:
                return True, None

        homography_matrix, points_used = cv2.findHomography(
            prev_pts,
            curr_pts,
            method=self.method,
            ransacReprojThreshold=self.ransac_reproj_threshold,
            maxIters=self.max_iters,
            confidence=self.confidence,
        )

        if homography_matrix is None or points_used is None:
            logger.warning(
                "Homography estimation failed for this frame "
                "(degenerate/collinear/insufficient inlier points)"
            )
            if isinstance(self.data, np.ndarray):
                return True, HomographyTransformation(self.data)
            else:
                return True, None

        proportion_points_used = np.sum(points_used) / len(points_used)

        update_prvs = proportion_points_used < self.proportion_points_used_threshold

        if self.data is not None:
            homography_matrix = homography_matrix @ self.data

        if update_prvs:
            self.data = homography_matrix

        return bool(update_prvs), HomographyTransformation(homography_matrix)

__init__(method=None, ransac_reproj_threshold=3, max_iters=2000, confidence=0.995, proportion_points_used_threshold=0.9) #

Store RANSAC parameters and initialize the running homography.

Source code in norfair/camera_motion.py
def __init__(
    self,
    method: int | None = None,
    ransac_reproj_threshold: int = 3,
    max_iters: int = 2000,
    confidence: float = 0.995,
    proportion_points_used_threshold: float = 0.9,
) -> None:
    """Store RANSAC parameters and initialize the running homography."""
    self.data = None
    if method is None:
        method = cv2.RANSAC
    self.method = method
    self.ransac_reproj_threshold = ransac_reproj_threshold
    self.max_iters = max_iters
    self.confidence = confidence
    self.proportion_points_used_threshold = proportion_points_used_threshold

__call__(curr_pts, prev_pts) #

Return the homography that best matches the optical flow.

Source code in norfair/camera_motion.py
def __call__(
    self, curr_pts: NDArray[np.float64], prev_pts: NDArray[np.float64]
) -> tuple[bool, HomographyTransformation | None]:
    """Return the homography that best matches the optical flow."""
    if not (
        isinstance(prev_pts, np.ndarray)
        and prev_pts.shape[0] >= 4
        and isinstance(curr_pts, np.ndarray)
        and curr_pts.shape[0] >= 4
    ):
        logger.warning(
            "The homography couldn't be computed in this frame "
            "due to low amount of points"
        )
        if isinstance(self.data, np.ndarray):
            return True, HomographyTransformation(self.data)
        else:
            return True, None

    homography_matrix, points_used = cv2.findHomography(
        prev_pts,
        curr_pts,
        method=self.method,
        ransacReprojThreshold=self.ransac_reproj_threshold,
        maxIters=self.max_iters,
        confidence=self.confidence,
    )

    if homography_matrix is None or points_used is None:
        logger.warning(
            "Homography estimation failed for this frame "
            "(degenerate/collinear/insufficient inlier points)"
        )
        if isinstance(self.data, np.ndarray):
            return True, HomographyTransformation(self.data)
        else:
            return True, None

    proportion_points_used = np.sum(points_used) / len(points_used)

    update_prvs = proportion_points_used < self.proportion_points_used_threshold

    if self.data is not None:
        homography_matrix = homography_matrix @ self.data

    if update_prvs:
        self.data = homography_matrix

    return bool(update_prvs), HomographyTransformation(homography_matrix)
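
The bookkeeping at the end of `__call__` can be reproduced with plain NumPy. The sketch below uses pure translations as stand-in homographies and a hand-written inlier mask in the `(N, 1)` layout that `cv2.findHomography` returns. `translation_h`, `apply_h`, and all numeric values are illustrative assumptions.

```python
import numpy as np


def translation_h(dx: float, dy: float) -> np.ndarray:
    """3x3 homography for a pure translation (illustrative helper)."""
    return np.array([[1.0, 0.0, dx], [0.0, 1.0, dy], [0.0, 0.0, 1.0]])


def apply_h(h: np.ndarray, point: np.ndarray) -> np.ndarray:
    """Apply a homography to a single 2D point using homogeneous coordinates."""
    x, y, w = h @ np.array([point[0], point[1], 1.0])
    return np.array([x / w, y / w])


# Plays the role of self.data: motion accumulated up to the reference frame.
reference_h = translation_h(10.0, 0.0)
# Plays the role of the fresh estimate between the reference and current frame.
frame_h = translation_h(3.0, -2.0)

# Same order as the source: the stored matrix is applied first, then the new one.
total_h = frame_h @ reference_h

# Made-up inlier mask: 7 of 10 points agreed with the model.
points_used = np.array(
    [[1], [1], [1], [0], [1], [1], [0], [1], [1], [0]], dtype=np.uint8
)
proportion = np.sum(points_used) / len(points_used)

# Below the threshold, the getter asks the estimator to refresh its reference.
update_reference = bool(proportion < 0.9)

print(apply_h(total_h, np.array([0.0, 0.0])), proportion, update_reference)
```

Because the new estimate is multiplied on the left, the returned matrix maps coordinates from the very first frame all the way to the current frame, even while the optical flow is measured against a later reference frame.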

MotionEstimator #

Camera motion estimator driven by sparse optical flow.

Uses OpenCV optical flow on a set of strong corner features to estimate the motion of the camera from frame to frame and feeds the result through a :class:TransformationGetter to recover a :class:CoordinatesTransformation.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| max_points | int | Maximum number of points sampled. More points make the estimation slower but more precise. | 200 |
| min_distance | int | Minimum distance between sampled points. | 15 |
| block_size | int | Size of the averaging block used when finding corners. See the OpenCV link below for details. | 3 |
| transformations_getter | TransformationGetter | The transformation estimator used on the sampled points. Defaults to HomographyTransformationGetter. | None |
| draw_flow | bool | Draw the optical flow on the frame in place, for debugging. | False |
| flow_color | tuple[int, int, int] | BGR color for the flow drawing. When draw_flow is True and no color is given, (0, 0, 100) is used, which is a dark red in BGR order. | None |
| quality_level | float | Minimum accepted quality of the image corners. | 0.01 |

Examples:

>>> from norfair import Tracker, Video
>>> from norfair.camera_motion import MotionEstimator
>>> video = Video(input_path="video.mp4")
>>> tracker = Tracker(...)
>>> motion_estimator = MotionEstimator()
>>> for frame in video:
...     detections = get_detections(frame)
...     coord_transformation = motion_estimator.update(frame)
...     tracked_objects = tracker.update(
...         detections, coord_transformations=coord_transformation
...     )
See Also

cv2.goodFeaturesToTrack

Source code in norfair/camera_motion.py
class MotionEstimator:
    """Camera motion estimator driven by sparse optical flow.

    Uses OpenCV optical flow on a set of strong corner features to
    estimate the motion of the camera from frame to frame and feeds the
    result through a :class:`TransformationGetter` to recover a
    :class:`CoordinatesTransformation`.

    Parameters
    ----------
    max_points : int, optional
        Maximum number of points sampled. More points make the estimation
        slower but more precise.
    min_distance : int, optional
        Minimum distance between sampled points.
    block_size : int, optional
        Size of the averaging block used when finding corners. See the
        OpenCV link below for details.
    transformations_getter : TransformationGetter, optional
        The transformation estimator used on the sampled points. Defaults
        to
        [`HomographyTransformationGetter`][norfair.camera_motion.HomographyTransformationGetter].
    draw_flow : bool, optional
        Draw the optical flow on the frame in place, for debugging.
    flow_color : tuple[int, int, int], optional
        BGR color for the flow drawing. When ``draw_flow`` is true and no
        color is given, ``(0, 0, 100)`` is used, a dark red in BGR order.
    quality_level : float, optional
        Minimum accepted quality of the image corners.

    Examples
    --------
    >>> from norfair import Tracker, Video
    >>> from norfair.camera_motion import MotionEstimator
    >>> video = Video(input_path="video.mp4")
    >>> tracker = Tracker(...)
    >>> motion_estimator = MotionEstimator()
    >>> for frame in video:
    ...     detections = get_detections(frame)
    ...     coord_transformation = motion_estimator.update(frame)
    ...     tracked_objects = tracker.update(
    ...         detections, coord_transformations=coord_transformation
    ...     )

    See Also
    --------
    [`cv2.goodFeaturesToTrack`](https://docs.opencv.org/3.4/dd/d1a/group__imgproc__feature.html#ga1d6bb77486c8f92d79c8793ad995d541)

    """

    def __init__(
        self,
        max_points: int = 200,
        min_distance: int = 15,
        block_size: int = 3,
        transformations_getter: TransformationGetter | None = None,
        draw_flow: bool = False,
        flow_color: tuple[int, int, int] | None = None,
        quality_level: float = 0.01,
    ):
        """Initialize sampling parameters and the transformation getter."""
        self.max_points = max_points
        self.min_distance = min_distance
        self.block_size = block_size

        self.draw_flow = draw_flow
        if self.draw_flow and flow_color is None:
            flow_color = (0, 0, 100)
        self.flow_color = flow_color

        self.gray_prvs = None
        self.prev_pts = None
        if transformations_getter is None:
            transformations_getter = HomographyTransformationGetter()

        self.transformations_getter = transformations_getter
        self.transformations_getter_copy = copy.deepcopy(transformations_getter)

        self.prev_mask = None
        self.gray_next = None
        self.quality_level = quality_level

    def update(
        self, frame: np.ndarray, mask: np.ndarray | None = None
    ) -> CoordinatesTransformation | None:
        """Estimate the camera motion for one frame.

        Parameters
        ----------
        frame : np.ndarray
            The current video frame.
        mask : np.ndarray, optional
            Optional mask excluding regions from corner sampling. Must
            have shape ``(frame.shape[0], frame.shape[1])``, dtype
            ``np.uint8``, and contain values ``0`` (ignore) or ``255``
            (consider), as required by ``cv2.goodFeaturesToTrack``.

            In general, the estimation works best when many points come
            from the background, so this parameter is useful for masking
            out detections or tracked objects and forcing the estimator
            to ignore moving objects. It can also be used to mask static
            overlays like sport scoreboards or security-camera
            timestamps.

        Returns
        -------
        CoordinatesTransformation or None
            A coordinate transformation that can map coordinates on this
            frame to absolute coordinates and vice versa, or ``None`` if
            the transformation could not be recovered.

        """
        self.gray_next = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        if self.gray_prvs is None:
            self.gray_prvs = self.gray_next
            self.prev_mask = mask

        curr_pts, prev_pts = None, None
        try:
            curr_pts, prev_pts = _get_sparse_flow(
                self.gray_next,
                self.gray_prvs,
                self.prev_pts,
                self.max_points,
                self.min_distance,
                self.block_size,
                self.prev_mask,
                quality_level=self.quality_level,
            )
            if self.draw_flow and self.flow_color is not None:
                for curr, prev in zip(curr_pts, prev_pts):
                    c = tuple(curr.astype(int).ravel())
                    p = tuple(prev.astype(int).ravel())
                    cv2.line(frame, c, p, self.flow_color, 2)
                    cv2.circle(frame, c, 3, self.flow_color, -1)
        except (cv2.error, ValueError, TypeError) as e:
            logger.warning(e)

        update_prvs, coord_transformations = True, None
        if curr_pts is not None and prev_pts is not None:
            try:
                update_prvs, coord_transformations = self.transformations_getter(
                    curr_pts.astype(np.float64, copy=False),
                    prev_pts.astype(np.float64, copy=False),
                )
            except (TypeError, ValueError, np.linalg.LinAlgError) as e:
                logger.warning(e)
                del self.transformations_getter
                self.transformations_getter = copy.deepcopy(
                    self.transformations_getter_copy
                )

        if update_prvs:
            self.gray_prvs = self.gray_next
            self.prev_pts = None
            self.prev_mask = mask
        else:
            self.prev_pts = prev_pts

        return coord_transformations

__init__(max_points=200, min_distance=15, block_size=3, transformations_getter=None, draw_flow=False, flow_color=None, quality_level=0.01) #

Initialize sampling parameters and the transformation getter.

Source code in norfair/camera_motion.py
def __init__(
    self,
    max_points: int = 200,
    min_distance: int = 15,
    block_size: int = 3,
    transformations_getter: TransformationGetter | None = None,
    draw_flow: bool = False,
    flow_color: tuple[int, int, int] | None = None,
    quality_level: float = 0.01,
):
    """Initialize sampling parameters and the transformation getter."""
    self.max_points = max_points
    self.min_distance = min_distance
    self.block_size = block_size

    self.draw_flow = draw_flow
    if self.draw_flow and flow_color is None:
        flow_color = (0, 0, 100)
    self.flow_color = flow_color

    self.gray_prvs = None
    self.prev_pts = None
    if transformations_getter is None:
        transformations_getter = HomographyTransformationGetter()

    self.transformations_getter = transformations_getter
    self.transformations_getter_copy = copy.deepcopy(transformations_getter)

    self.prev_mask = None
    self.gray_next = None
    self.quality_level = quality_level

update(frame, mask=None) #

Estimate the camera motion for one frame.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| frame | ndarray | The current video frame. | required |
| mask | ndarray | Optional mask excluding regions from corner sampling. Must have shape (frame.shape[0], frame.shape[1]), dtype np.uint8, and contain values 0 (ignore) or 255 (consider), as required by cv2.goodFeaturesToTrack. In general, the estimation works best when many points come from the background, so this parameter is useful for masking out detections or tracked objects and forcing the estimator to ignore moving objects. It can also be used to mask static overlays like sport scoreboards or security-camera timestamps. | None |

Returns:

Type Description
CoordinatesTransformation or None

A coordinate transformation that can map coordinates on this frame to absolute coordinates and vice versa, or None if the transformation could not be recovered.
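
A mask that satisfies the shape, dtype, and value requirements above can be built with NumPy alone. The sketch below assumes boxes are given as `(x1, y1, x2, y2)` pixel coordinates. `build_mask` and its `margin` parameter are hypothetical and not part of Norfair.

```python
import numpy as np


def build_mask(frame_shape, boxes, margin=0):
    """Return a uint8 mask: 255 where corners may be sampled, 0 inside boxes.

    Hypothetical helper; boxes are (x1, y1, x2, y2) in pixel coordinates.
    """
    height, width = frame_shape[:2]
    mask = np.full((height, width), 255, dtype=np.uint8)
    for x1, y1, x2, y2 in boxes:
        # Grow each box by the margin, then clip it to the frame bounds.
        left = max(int(x1) - margin, 0)
        top = max(int(y1) - margin, 0)
        right = min(int(x2) + margin, width)
        bottom = min(int(y2) + margin, height)
        mask[top:bottom, left:right] = 0
    return mask


frame = np.zeros((480, 640, 3), dtype=np.uint8)  # stand-in for a video frame
mask = build_mask(frame.shape, boxes=[(100, 50, 200, 150)], margin=5)
print(mask.shape, mask.dtype)
```

The result can then be passed as `motion_estimator.update(frame, mask=mask)`. A small margin helps exclude corners on object edges that move together with the object.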

Source code in norfair/camera_motion.py
def update(
    self, frame: np.ndarray, mask: np.ndarray | None = None
) -> CoordinatesTransformation | None:
    """Estimate the camera motion for one frame.

    Parameters
    ----------
    frame : np.ndarray
        The current video frame.
    mask : np.ndarray, optional
        Optional mask excluding regions from corner sampling. Must
        have shape ``(frame.shape[0], frame.shape[1])``, dtype
        ``np.uint8``, and contain values ``0`` (ignore) or ``255``
        (consider), as required by ``cv2.goodFeaturesToTrack``.

        In general, the estimation works best when many points come
        from the background, so this parameter is useful for masking
        out detections or tracked objects and forcing the estimator
        to ignore moving objects. It can also be used to mask static
        overlays like sport scoreboards or security-camera
        timestamps.

    Returns
    -------
    CoordinatesTransformation or None
        A coordinate transformation that can map coordinates on this
        frame to absolute coordinates and vice versa, or ``None`` if
        the transformation could not be recovered.

    """
    self.gray_next = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    if self.gray_prvs is None:
        self.gray_prvs = self.gray_next
        self.prev_mask = mask

    curr_pts, prev_pts = None, None
    try:
        curr_pts, prev_pts = _get_sparse_flow(
            self.gray_next,
            self.gray_prvs,
            self.prev_pts,
            self.max_points,
            self.min_distance,
            self.block_size,
            self.prev_mask,
            quality_level=self.quality_level,
        )
        if self.draw_flow and self.flow_color is not None:
            for curr, prev in zip(curr_pts, prev_pts):
                c = tuple(curr.astype(int).ravel())
                p = tuple(prev.astype(int).ravel())
                cv2.line(frame, c, p, self.flow_color, 2)
                cv2.circle(frame, c, 3, self.flow_color, -1)
    except (cv2.error, ValueError, TypeError) as e:
        logger.warning(e)

    update_prvs, coord_transformations = True, None
    if curr_pts is not None and prev_pts is not None:
        try:
            update_prvs, coord_transformations = self.transformations_getter(
                curr_pts.astype(np.float64, copy=False),
                prev_pts.astype(np.float64, copy=False),
            )
        except (TypeError, ValueError, np.linalg.LinAlgError) as e:
            logger.warning(e)
            del self.transformations_getter
            self.transformations_getter = copy.deepcopy(
                self.transformations_getter_copy
            )

    if update_prvs:
        self.gray_prvs = self.gray_next
        self.prev_pts = None
        self.prev_mask = mask
    else:
        self.prev_pts = prev_pts

    return coord_transformations
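
The reference-frame policy at the end of `update` can be simulated without any video. The sketch below replays a made-up sequence of inlier proportions and records which frame the flow was measured against at each step. It assumes a getter that signals `update_prvs` only when the proportion drops below the threshold and ignores the failure paths, which also force an update. `simulate_reference_updates` is a hypothetical name.

```python
def simulate_reference_updates(inlier_proportions, threshold=0.9):
    """Return, for each frame, the index of the reference frame it was compared to.

    Hypothetical helper that mirrors only the update_prvs bookkeeping.
    """
    reference = 0  # the first frame is compared against itself
    used_references = []
    for frame_index, proportion in enumerate(inlier_proportions):
        used_references.append(reference)
        update_prvs = proportion < threshold
        if update_prvs:
            # Too few points still fit the model: restart from the current frame.
            reference = frame_index
    return used_references


proportions = [1.0, 0.97, 0.93, 0.85, 0.99, 0.95, 0.6, 0.98]
print(simulate_reference_updates(proportions))
# → [0, 0, 0, 0, 3, 3, 3, 6]
```

Holding on to one reference frame across several updates keeps small per-frame errors from compounding, and a new reference is taken only once too few of the tracked points still agree with the estimate.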

See also#