Marie Curie Action

Training Material

Medical Image Registration Taxonomy

By Alain Pitiot, PhD

Adapted from Maintz et al., MedIA, 1998

First appeared in: Casciaro S., Distante A., Samset E. (editors) (2005) "Augmented Reality in Surgery", Text Book of the 2005 ARISER Summer School

Adapted for web by Eigil Samset

Introduction


Source image +
transformed target image

Target image (PET thorax)

Within the current clinical setting, medical imaging is a vital component of a large number of applications. Such applications occur throughout the clinical track of events; not only within clinical diagnostic settings, but prominently so in the area of planning, execution, and evaluation of surgical and radiotherapeutical procedures. Since information gained from two images acquired in the clinical track of events is usually of a complementary nature, proper integration of useful data obtained from the separate images is often desired. A first step in this integration process is to bring the modalities involved into spatial alignment, a procedure referred to as registration.

An example of registering different modalities can be found in radiotherapy treatment planning, where currently CT is used almost exclusively. However, the use of MR and CT combined would be beneficial, as the former is better suited for delineation of tumor tissue (and has in general better soft tissue contrast), while the latter is needed for accurate computation of the radiation dose.

We give here a short taxonomy of the main registration approaches. This taxonomy is that of Maintz et al. in their landmark 1998 MedIA review. We have simplified it to fit a smaller format.

Subject


Homer Simpson. Homer Simpson
Sagittal MRI       (rest position)

When all of the images involved in a registration task are acquired of a single patient, we refer to it as intrasubject registration. If the registration is accomplished using two images of different patients (or a patient and a model), this is referred to as intersubject registration. If one image is acquired from a single patient, and the other image is somehow constructed from an image information database obtained using imaging of many subjects, we name it atlas registration.


Homer sapiens brain Homer Simpson


Homer Simpson Homer Simpson
(labelled atlas)

Intrasubject registration is by far the most common of the three, used in almost any type of diagnostic and interventional procedure. Intersubject and atlas registration appear mostly in 3D/3D MR or CT brain image applications. The nature of the registration transformation is mostly curved; these applications are always intrinsic, either segmentation based or voxel property based, using the full image content. A proper (manual) initialization is frequently desired. The use of intersubject and atlas matching can notably be found in the areas of gathering statistics on the size and shape of specific structures, finding (accordingly) anomalous structures, and transferring segmentations from one image to another.





Modalities involved in the registration


A few imaging modalities. Combination:

  • mono-modal: same modality for source and target
  • multi-modal: different modality

Four classes of registration tasks can be recognized based on the modalities that are involved. In monomodal applications, the images to be registered belong to the same modality, as opposed to multimodal registration tasks, where the images to be registered stem from two different modalities. In modality to model and patient to modality registration only one image is involved and the other modality is either a model or the patient himself. Hence we use the term modality in a loose sense, not only applying to acquired images, but also to mathematical models of anatomy or physiology, and even to the patient himself. Such inclusions are necessary to properly type-cast the four categories according to the actual registration task to be solved.

For diagnostic purposes, two myocardial SPECT images are acquired of the patient, under rest and stress conditions. Their registration is a monomodal application. To relate an area of dysfunction to anatomy, a PET image is registered to an MR image. This is a multimodal application. To register an MR to a PET image, a PET image is first simulated from the MR image, and the real and simulated PET images are registered. This is still a multimodal application. An example of modality to model is the registration of an MR brain image to a mathematically defined compartmental model of gross brain structures.

The patient to modality registration tasks appear almost exclusively in intra-operative and radiotherapy applications. Modality to model can be applied in gathering statistics on tissue morphology (e.g., for finding anomalies relative to normalized structures), and to segmentation tasks. Monomodal tasks are well suited for growth monitoring, intervention verification, rest-stress comparisons, ictal-interictal comparisons, subtraction imaging (also DSA, CTA), and many other applications.

The applications of multimodal registration are abundant and diverse, predominantly diagnostic in nature. A coarse division would be into anatomical-anatomical registration, where images showing different aspects of tissue morphology are combined, and functional-anatomical, where tissue metabolism and its spatial location relative to anatomical structures are related.

Dimensionality

Spatial registration methods

The main division here is whether all dimensions are spatial, or whether time is an added dimension. In either case, the problem can be further categorized depending on the number of spatial dimensions involved.

3D/3D registration normally applies to the registration of two tomographic datasets, or the registration of a single tomographic image to any spatially defined information, e.g., a vector obtained from EEG data.

2D/2D registration may apply to separate slices from tomographic data, or intrinsically 2D images like portal images. Compared to 3D/3D registration, 2D/2D registration is less complex by an order of magnitude both where the number of parameters and the volume of the data are concerned, so obtaining a registration is in many cases easier and faster than in the 3D/3D case.

We reserve 2D/3D registration for the direct alignment of spatial data to projective data (e.g., a pre-operative CT image to an intra-operative X-ray image), or the alignment of a single tomographic slice to spatial data. Since most 2D/3D applications concern intra-operative procedures within the operating theater, they are heavily time-constrained and consequently have a strong focus on speed issues connected to the computation of the paradigm and the optimization. The majority of applications outside the operating theater and radiotherapy setting allow for off-line registration, so speed issues need only be addressed as constrained by clinical routine.

Registration of time series

Time series of images are acquired for various reasons, such as monitoring of bone growth in children (long time interval), monitoring of tumor growth (medium interval), post-operative monitoring of healing (short interval), or observing the passing of an injected bolus through a vessel tree (ultra-short interval). If two images need to be compared, registration will be necessary except in instances of ultra-short time series, where the patient does not leave the scanner between the acquisitions of two images. The same observations as for spatial-only registrations apply.

Nature of registration basis

Extrinsic registration methods


Extrinsic Registration:
Stereotactic Frame

Image based registration can be divided into extrinsic, i.e., based on foreign objects introduced into the imaged space, and intrinsic methods, i.e., based on the image information as generated by the patient.

Extrinsic methods rely on artificial objects attached to the patient, objects which are designed to be well visible and accurately detectable in all of the pertinent modalities. As such, the registration of the acquired images is comparatively easy, fast, can usually be automated, and, since the registration parameters can often be computed explicitly, has no need for complex optimization algorithms. The main drawbacks of extrinsic registration are the prospective character, i.e., provisions must be made in the pre-acquisition phase, and the often invasive character of the marker objects.

Non-invasive markers can be used, but as a rule are less accurate. A commonly used fiducial object is a stereotactic frame screwed rigidly to the patient's outer skull table, a device which until recently provided the "gold standard" for registration accuracy. Such frames are used for localization and guidance purposes in neurosurgery. Sometimes other invasive objects are used, such as screw-mounted markers, but usually non-invasive marking devices are reverted to. Most popular amongst these are markers glued to the skin, but larger devices that can be fitted snugly to the patient, like individualized foam moulds, are also used.

Since extrinsic methods by definition cannot include patient related image information, the nature of the registration transformation is often restricted to be rigid (translations and rotations only). Because of the rigid transformation constraint, and various practical considerations, use of extrinsic 3D/3D methods is largely limited to brain and orthopedic imaging, although markers can often be used in projective (2D) imaging of any body area. Non-rigid transformations can in some cases be obtained using markers, e.g., in studies of animal heart motion, where markers can be implanted into the cardiac wall.

Intrinsic registration methods

Intrinsic methods rely on patient generated image content only. Registration can be based on a limited set of identified salient points (landmarks), on the alignment of segmented binary structures (segmentation based), most commonly object surfaces, or directly on measures computed from the image grey values (voxel property based).

Landmark based registration methods


Landmark based registration (CT-PET)

Landmarks can be anatomical, i.e., salient and accurately locatable points of the morphology of the visible anatomy, usually identified interactively by the user, or geometrical, i.e., points at the locus of the optimum of some geometric property, e.g., local curvature extrema, corners, etc., generally localized in an automatic fashion.

Landmark based registration is versatile in the sense that it at least in theory can be applied to any image, no matter what the object or subject is. Landmark based methods are mostly used to find rigid or affine transformations. If the sets of points are large enough, they can theoretically be used for more complex transformations. Anatomical landmarks are also often used in combination with an entirely different registration basis: methods that rely on optimization of a parameter space that is not quasi-convex are prone to sometimes get stuck in local optima, possibly resulting in a large mismatch. By constraining the search space according to anatomical landmarks, such mismatches are unlikely to occur. Moreover, the search procedure can be sped up considerably. A drawback is that user interaction is usually required for the identification of the landmarks.

In landmark based registration, the set of identified points is sparse compared to the original image content, which makes for relatively fast optimization procedures. Such algorithms optimize measures such as the average distance (norm) between each landmark and its closest counterpart (the Procrustean metric), or iterated minimal landmark distances. For the optimization of the latter measure the Iterative closest point (ICP) algorithm and derived methods are popular. The popularity of ICP can be attributed to its versatility (it can be used for point sets, and for implicitly and explicitly defined curves, surfaces and volumes), its computational speed, and its ease of implementation. Yet other methods perform landmark registration by testing a number of likely transformation hypotheses, which can, e.g., be formulated by aligning three randomly picked points from each point set involved. Common optimization methods here are quasi-exhaustive searches, graph matching and dynamic programming approaches.
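The iterated closest-point idea can be illustrated with a minimal 2D sketch. It is not the published ICP algorithm in full generality: it assumes plain point sets, a rigid transformation, a brute-force nearest-neighbour search, and a small initial misalignment. The function names (`fit_rigid_2d`, `apply_rigid_2d`, `icp_2d`) and the landmark coordinates are hypothetical.

```python
import math

def fit_rigid_2d(src, dst):
    """Least-squares rotation angle and translation mapping paired points src -> dst."""
    n = len(src)
    csx = sum(p[0] for p in src) / n
    csy = sum(p[1] for p in src) / n
    cdx = sum(q[0] for q in dst) / n
    cdy = sum(q[1] for q in dst) / n
    # Accumulate dot and cross products of the centred point pairs.
    dot = cross = 0.0
    for (px, py), (qx, qy) in zip(src, dst):
        ax, ay = px - csx, py - csy
        bx, by = qx - cdx, qy - cdy
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    # The translation maps the rotated source centroid onto the target centroid.
    tx = cdx - (c * csx - s * csy)
    ty = cdy - (s * csx + c * csy)
    return theta, tx, ty

def apply_rigid_2d(points, theta, tx, ty):
    """Rotate points by theta about the origin, then translate by (tx, ty)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]

def closest_point(p, candidates):
    """Brute-force nearest neighbour of p among candidates."""
    return min(candidates, key=lambda q: (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)

def icp_2d(source, target, iterations=10):
    """Alternate closest-point matching and rigid fitting for a fixed number of rounds."""
    moved = list(source)
    for _ in range(iterations):
        matches = [closest_point(p, target) for p in moved]
        theta, tx, ty = fit_rigid_2d(moved, matches)
        moved = apply_rigid_2d(moved, theta, tx, ty)
    # Summarise the accumulated motion as one rigid transform of the original source.
    return fit_rigid_2d(source, moved), moved

# Hypothetical landmark set and a slightly misaligned copy of it.
target = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 5.0), (2.0, 7.0)]
source = apply_rigid_2d(target, -0.05, 0.2, -0.1)
(theta, tx, ty), moved = icp_2d(source, target)
residual = max(math.hypot(p[0] - q[0], p[1] - q[1]) for p, q in zip(moved, target))
print(f"recovered rotation: {theta:.3f} rad, largest residual: {residual:.2e}")
```

In this toy case the misalignment is much smaller than the spacing between landmarks, so the closest-point matches are correct from the first iteration. With a larger initial offset the same loop can settle in a local optimum, which is why a coarse pre-registration is commonly applied first.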

Segmentation based registration methods


Segmentation based registration (corpora callosa)

Segmentation based registration methods can be rigid model based, where anatomically the same structures (mostly surfaces) are extracted from both images to be registered, and used as sole input for the alignment procedure. They can also be deformable model based, where an extracted structure (also mostly surfaces, and curves) from one image is elastically deformed to fit the second image.

The rigid model based approaches are probably the most popular methods currently in clinical use. Since the segmentation task is fairly easy to perform, and the computational complexity relatively low, the method has remained popular, and many follow-up papers aimed at automating the segmentation step, improving the optimization performance, or otherwise extending the method have been published. A drawback of segmentation based methods is that the registration accuracy is limited to the accuracy of the segmentation step.

With deformable models, however, the optimization criterion is different: it is always locally defined and computed, and the deformation is constrained by elastic modeling constraints (by a regularization term) imposed onto the segmented curve or surface. Deformable curves appear in the literature as snakes or active contours. To ease the physical modeling, the data structure of deformable models is not commonly a point set. Instead, it is often represented using localized functions such as splines. The deformation process is always done iteratively, small deformations at a time.

Deformable model approaches are based on a template model that needs to be defined in one image. After this, two types of approaches can be identified: the template is either deformed to match a segmented structure in the second image, or the second image is used unsegmented. In the latter case, the fit criterion of the template can be, e.g., to lie on an edge region in the second image.

As opposed to registration based on extracted rigid models, which is mainly suited for intrasubject registration, deformable models are in theory very well suited for intersubject and atlas registration, as well as for registration of a template obtained from a patient to a mathematically defined general model of the templated anatomy. A drawback of deformable models is that they often need a good initial position in order to properly converge, which is generally realized by (rigid) pre-registration of the images involved. Another disadvantage is that the local deformation of the template can be unpredictably erratic if the target structure differs sufficiently from the template structure.

Deformable models are best suited to find local curved transformations between images, and less so for finding (global) rigid or affine transformations. They can be used on almost any anatomical area or modality, and are usually automated but for the segmentation step. In the current literature the major applications are registration of bone contours obtained from CT, and cortical registration of MR images.
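The interplay between a local fit criterion and a regularization term can be sketched on a toy 1D problem. This is an illustrative assumption, not a snake or spline implementation: the template is a list of heights, `edge` holds hypothetical detected edge positions in the second image, and the energy is a squared fit term plus `lam` times a squared first-difference smoothness term, minimized by small gradient steps.

```python
def energy(template, edge, u, lam):
    """Fit term (deformed template vs. edge) plus lam times a smoothness term on u."""
    fit = sum((t + d - e) ** 2 for t, d, e in zip(template, u, edge))
    smooth = sum((u[i + 1] - u[i]) ** 2 for i in range(len(u) - 1))
    return fit + lam * smooth

def deform(template, edge, lam=1.0, step=0.05, iterations=200):
    """Iteratively deform the template, a small displacement update at a time."""
    n = len(template)
    u = [0.0] * n  # displacement of every template sample, initially none
    history = [energy(template, edge, u, lam)]
    for _ in range(iterations):
        grad = []
        for i in range(n):
            g = 2.0 * (template[i] + u[i] - edge[i])      # pull towards the edge
            if i > 0:
                g += 2.0 * lam * (u[i] - u[i - 1])        # stay close to left neighbour
            if i < n - 1:
                g += 2.0 * lam * (u[i] - u[i + 1])        # stay close to right neighbour
            grad.append(g)
        u = [ui - step * gi for ui, gi in zip(u, grad)]
        history.append(energy(template, edge, u, lam))
    return u, history

# Hypothetical flat template and an edge profile containing one outlier (the value 3.0).
template = [0.0] * 10
edge = [0.0, 0.0, 1.0, 1.0, 3.0, 1.0, 1.0, 0.0, 0.0, 0.0]
u, history = deform(template, edge)
print("energy before:", round(history[0], 3), "after:", round(history[-1], 3))
print("displacements:", [round(v, 2) for v in u])
```

Raising `lam` makes the deformed template stiffer and less sensitive to the outlier; lowering it lets the template follow the edge data more closely. The step size is chosen small enough for this quadratic energy to decrease steadily.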

Voxel property based registration methods


Voxel based registration
(cryo section, myelin-stained histological section)

The voxel property based registration methods stand apart from the other intrinsic methods by the fact that they operate directly on the image grey values, without prior data reduction by the user or segmentation. There are two distinct approaches: the first is to immediately reduce the image grey value content to a representative set of scalars and orientations, the second is to use the full image content throughout the registration process.

Principal axes and moments based methods are the prime examples of reductive registration methods. Within these methods the image center of gravity is computed from the image zeroth and first order moments, and its principal orientations (principal axes) from the second order moments. Registration is then performed by aligning the center of gravity and the principal orientations. The result is usually not very accurate, and the method is not equipped to handle differences in scanned volume well. Despite their drawbacks, principal axes methods are widely used in registration problems that require no high accuracy, because of the automatic and very fast nature of their use, and their easy implementation. The method is used primarily in the re-alignment of scintigraphic cardiac studies (even intersubject), and as a coarse pre-registration in various other registration areas. Moment based methods also appear as hybridly classified registration methods that use segmented or binarized image data for input. In many applications, pre-segmentation is mandatory in order for moment based methods to produce acceptable results.
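As a rough illustration of the reductive approach, the sketch below computes the center of gravity and the major-axis angle of a small 2D grey value image, and derives a rotation from the difference between two such angles. It assumes a 2D image stored as a list of rows and takes the orientation from the second order central moments; the function names and the test image are hypothetical.

```python
import math

def principal_axes_2d(image):
    """Return (cx, cy, theta): centre of gravity and major-axis angle of a 2D image."""
    m00 = m10 = m01 = 0.0
    for y, row in enumerate(image):
        for x, v in enumerate(row):
            m00 += v          # zeroth order moment (total grey value mass)
            m10 += x * v      # first order moments
            m01 += y * v
    cx, cy = m10 / m00, m01 / m00
    mu20 = mu02 = mu11 = 0.0
    for y, row in enumerate(image):
        for x, v in enumerate(row):
            dx, dy = x - cx, y - cy
            mu20 += v * dx * dx   # second order central moments
            mu02 += v * dy * dy
            mu11 += v * dx * dy
    theta = 0.5 * math.atan2(2.0 * mu11, mu20 - mu02)
    return cx, cy, theta

def principal_axes_alignment(source, target):
    """Rotation (radians) and the two centres of gravity needed to align source to target."""
    sx, sy, s_theta = principal_axes_2d(source)
    tx, ty, t_theta = principal_axes_2d(target)
    return t_theta - s_theta, (sx, sy), (tx, ty)

# Hypothetical binary image of a horizontal bar, and its transpose (a vertical bar).
bar = [[0] * 12 for _ in range(12)]
for y in (4, 5):
    for x in range(2, 10):
        bar[y][x] = 1
upright = [list(column) for column in zip(*bar)]

rotation, source_centre, target_centre = principal_axes_alignment(bar, upright)
print("rotation (degrees):", round(math.degrees(rotation), 1))
print("centres of gravity:", source_centre, target_centre)
```

A principal axis has no direction, so the recovered angle is only defined up to 180 degrees. This is one reason the result is best treated as a coarse pre-registration.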

Voxel property based methods using the full image content are the most interesting methods researched currently. Theoretically, these are the most flexible of registration methods, since they, unlike all other methods mentioned, do not start with reducing the grey valued image to relatively sparse extracted information, but use all of the available information throughout the registration process. Although voxel property based methods have been around a long time, their use in extensive 3D/3D clinical applications has been limited by the considerable computational costs. An increasing clinical call for accurate and retrospective registration, along with the development of ever-faster computers with large internal memories, has enabled full-image-content methods to be used in clinical practice.

As concerns full-image-content based voxel property registration methods, the literature reports the following paradigms being used:

  • cross-correlation (of original images or extracted feature images)
  • Fourier domain based cross-correlation and phase-only correlation
  • minimization of variance of intensity ratios
  • minimization of variance of grey values within segments
  • minimization of the histogram entropy of difference images
  • histogram clustering and minimization of histogram dispersion
  • maximization of mutual information (relative entropy) of the histogram
  • maximization of zero crossings in difference images (stochastic sign change (SSC) and deterministic sign change (DSC) criteria)
  • determination of the optic flow field
  • minimization of the absolute or squared intensity differences
  • matching local low-order Taylor expansions determined by the image grey values
  • implicitly using surface registration by interpreting a 3D image as an instance of a surface in 4D space

The list is not exhaustive.
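Two of the paradigms listed above, squared intensity differences and mutual information, can be compared on a toy 1D example. The sketch assumes discrete grey values, circular integer shifts as the only transformation, and a hypothetical "PET-like" signal produced by relabelling the grey values of an "MR-like" signal; none of these names or values come from the source.

```python
import math
from collections import Counter

def mutual_information(a, b):
    """Mutual information (in bits) of two equally long sequences of discrete grey values."""
    n = len(a)
    joint = Counter(zip(a, b))   # joint histogram
    hist_a = Counter(a)          # marginal histograms
    hist_b = Counter(b)
    mi = 0.0
    for (va, vb), count in joint.items():
        p_ab = count / n
        mi += p_ab * math.log2(p_ab * n * n / (hist_a[va] * hist_b[vb]))
    return mi

def sum_squared_differences(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def shift(signal, k):
    """Circularly shift a 1D signal to the right by k samples."""
    k %= len(signal)
    cut = len(signal) - k
    return signal[cut:] + signal[:cut]

# Hypothetical grey value profile and a relabelled, shifted version of it.
mr = [0, 0, 1, 1, 1, 2, 3, 3, 2, 2, 1, 0, 0, 0, 3, 1]
relabel = {0: 3, 1: 0, 2: 2, 3: 1}          # different contrast, same structure
pet = shift([relabel[v] for v in mr], 3)     # "multimodal" target
mr_later = shift(mr, 3)                      # "monomodal" target

shifts = range(len(mr))
best_by_mi = max(shifts, key=lambda k: mutual_information(shift(mr, k), pet))
best_by_ssd = min(shifts, key=lambda k: sum_squared_differences(shift(mr, k), mr_later))
print("shift found by mutual information (multimodal pair):", best_by_mi)
print("shift found by squared differences (monomodal pair):", best_by_ssd)
```

Mutual information only asks that grey values in one image predict those in the other, which is why it still peaks at the correct shift when the contrast has been relabelled. Squared differences assume that corresponding structures have the same grey value, so the sketch applies them to the monomodal pair only.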

Nature and domain of the transformation

Nature of the transformation

An image coordinate transformation is called rigid, when only translations and rotations are allowed. If the transformation maps parallel lines onto parallel lines it is called affine. If it maps lines onto lines, it is called projective. Finally, if it maps lines onto curves, it is called curved or elastic. Each type of transformation contains as special cases the ones described before it, e.g., the rigid transformation is a special kind of affine transformation. Most applications represent curved transformations in terms of a local vector displacement (disparity) field or as polynomial transformations in terms of the old coordinates.
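The nesting of the rigid, affine, and projective classes can be checked numerically in 2D with homogeneous 3x3 matrices. The sketch below is an illustration under assumed example matrices and points (all hypothetical); it tests which geometric properties survive each transformation. Curved transformations are not covered, since they cannot be written as a single matrix.

```python
import math

def rigid(theta, tx, ty):
    """Rotation by theta followed by a translation, as a 3x3 homogeneous matrix."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, tx], [s, c, ty], [0.0, 0.0, 1.0]]

def transform(m, p):
    """Apply a 3x3 homogeneous matrix to a 2D point, including the projective division."""
    x, y = p
    w = m[2][0] * x + m[2][1] * y + m[2][2]
    return ((m[0][0] * x + m[0][1] * y + m[0][2]) / w,
            (m[1][0] * x + m[1][1] * y + m[1][2]) / w)

def length(p, q):
    return math.hypot(q[0] - p[0], q[1] - p[1])

def cross(p, q, r, s):
    """Cross product of directions p->q and r->s; zero means the directions are parallel."""
    return (q[0] - p[0]) * (s[1] - r[1]) - (q[1] - p[1]) * (s[0] - r[0])

# Two parallel segments AB and CD, plus the midpoint M of AB.
A, B, C, D, M = (0.0, 0.0), (2.0, 1.0), (0.0, 3.0), (2.0, 4.0), (1.0, 0.5)

matrices = {
    "rigid": rigid(math.radians(30), 5.0, -2.0),
    "affine": [[2.0, 0.5, 1.0], [0.0, 1.5, -2.0], [0.0, 0.0, 1.0]],
    "projective": [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.1, 0.0, 1.0]],
}

report = {}
for name, m in matrices.items():
    a, b, c, d, mid = (transform(m, p) for p in (A, B, C, D, M))
    report[name] = {
        "length_kept": math.isclose(length(a, b), length(A, B)),   # distances preserved
        "parallel_kept": abs(cross(a, b, c, d)) < 1e-9,            # parallel lines stay parallel
        "line_kept": abs(cross(a, b, a, mid)) < 1e-9,              # collinear points stay collinear
    }
    print(name, report[name])
```

Each class keeps every property kept by the classes after it and adds one more, which is the sense in which a rigid transformation is a special kind of affine transformation, and an affine one a special kind of projective transformation.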

Domain of the transformation

A transformation is called global if it applies to the entire image, and local if subsections of the image each have their own transformations defined.

General transformation observations

Columns: source, global, local. Row 1: rigid, row 2: affine, row 3: parameterized, row 4: fluid/elastic.

Local transformations are seldom used directly, because they may violate the local continuity and bijectiveness of the transformations, which impairs straightforward image resampling when applying the transformation to the image. The term local transformation is reserved for transformations that are composites of at least two transformations determined on sub-images that cannot be generally described as a global transformation. Hence, a single transformation computed on some volume of interest of an image is a global transformation, except that "global" now refers to the new image, which is a sub-image of the original. This definition, perhaps confusingly, does not prevent a global transformation from being computed locally, e.g., some applications compute a global rigid transformation of an image of the entire head based on computations done in the area of the facial surface only. Local rigid, affine, and projective transformations occur only rarely in the literature, although local rigid transformations may appear embedded in local curved transformations. Some problems that are intrinsically locally rigid (such as the individual vertebrae in an image of the spinal column) are in registration tasks often solved by splitting the image into sub-images that meet the global rigid body constraint.

In recently published registration papers, as a rule, rigid and affine transformations are global, and curved transformations are local. This makes sense, given the physical model underlying the curved transformation type, and given that the rigid body constraint is globally, or in well defined sub-images, approximately met in many common medical images. Affine transformations are typically used in instances of rigid body movement where the image scaling factors are unknown or suspected to be incorrect (notably in MR images because of geometric distortions).

Since local information of the anatomy is essential to provide an accurate local curved transformation, applications are nearly always intrinsic, mostly deformable model based or using the full image content, and mostly semi-automatic, requiring a user-identified initialization. They appear almost solely using anatomical images (CT, MR) of the head, and are excellently suited for intersubject and image to atlas registration. Many methods require a pre-registration (initialization) using a rigid or affine transformation.

The global rigid transformation is used most frequently in registration applications. It is popular because in many common medical images the rigid body constraint is, at least to a good approximation, satisfied. Furthermore, it has relatively few parameters to be determined, and many registration techniques are not equipped to supply a more complex transformation. The most common application area is the human head.

Interaction

Concerning registration algorithms, three levels of interaction can be recognized:

  • automatic, where the user only supplies the algorithm with the image data and possibly information on the image acquisition;
  • interactive, where the user performs the registration, assisted by software supplying a visual or numerical impression of the current transformation, and possibly an initial transformation guess;
  • semi-automatic, where the interaction required can be of two different natures: the user needs to initialize the algorithm, e.g., by segmenting the data, or steer the algorithm, e.g., by rejecting or accepting suggested registration hypotheses.

Many authors strive for fully automated algorithms, but it is debatable whether this is desirable in all current clinical applications. The argument is that many current methods have a trade-off between minimal interaction and speed, accuracy, or robustness. Some methods would doubtlessly benefit if the user were "kept in the loop", steering the optimization, narrowing search space, or rejecting mismatches. On the other hand, many methods spend over 90% of their computation time examining registrations at a resolution level that would hardly benefit from human intervention. If they perform robustly, such methods are better left automated. Furthermore, many applications require registration algorithms to operate objectively, and thus allow no human interaction. Human interaction also complicates the validation of registration methods, inasmuch as it is a parameter not easily quantified or controlled.

Optimization procedure

The parameters that make up the registration transformation can either be computed directly, i.e., determined in an explicit fashion from the available data, or searched for, i.e., determined by finding an optimum of some function defined on the parameter space. In the former case, the manner of computation is completely determined by the paradigm.

In the case of searching optimization methods, most registration methods are able to formulate the paradigm in a standard mathematical function of the transformation parameters to be optimized. This function attempts to quantify the similarity as dictated by the paradigm between two images given a certain transformation. Such functions are generally less complex in monomodal registration applications, since the similarity is more straightforward to define. Hopefully, the similarity function is well-behaved (quasi-convex) so one of the standard and well-documented optimization techniques can be used.
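A searched-for registration can be sketched with a single transformation parameter. The example below is a toy under stated assumptions: the "images" are a hypothetical analytic 1D Gaussian profile sampled at 30 positions, the similarity function is the sum of squared differences, and the search is a plain gradient descent with a numerically estimated slope. In this setting the cost is well-behaved around the optimum, which is what makes such a simple search sufficient.

```python
import math

def profile(x, centre=10.0, width=3.0):
    """Hypothetical 1D intensity profile standing in for the source image."""
    return math.exp(-0.5 * ((x - centre) / width) ** 2)

SAMPLES = range(30)
TRUE_SHIFT = 2.5
# The target image is the source profile moved by TRUE_SHIFT.
target = [profile(x - TRUE_SHIFT) for x in SAMPLES]

def cost(shift):
    """Sum of squared differences between the shifted source and the target."""
    return sum((profile(x - shift) - t) ** 2 for x, t in zip(SAMPLES, target))

def gradient_descent(function, start=0.0, rate=1.0, iterations=100, h=1e-4):
    """Follow the negative slope of a one-parameter function, estimated by central differences."""
    t = start
    for _ in range(iterations):
        slope = (function(t + h) - function(t - h)) / (2.0 * h)
        t -= rate * slope
    return t

estimate = gradient_descent(cost)
print("estimated shift:", round(estimate, 3))
print("cost at start:", round(cost(0.0), 4), "cost at estimate:", round(cost(estimate), 8))
```

The starting value, step size `rate`, and iteration count were picked by hand for this profile. With a start far outside the overlap of the two profiles the slope becomes nearly flat and the same loop would stall, which mirrors the dependence on initialization described for many registration methods.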


Deterministic optimization: Gradient descent


Stochastic optimization: Simulated annealing

Popular techniques are Powell's method, the Downhill Simplex method, Brent's method and series of one-dimensional searches, Levenberg-Marquardt optimization, Newton-Raphson iteration, stochastic search methods, gradient descent methods, genetic methods, geometric hashing, and quasi-exhaustive search methods.

Frequent additions are multi-resolution (e.g., pyramid) and multiscale approaches to speed up convergence, to reduce the number of transformations to be examined (which is especially important in the quasi-exhaustive search methods) and to avoid local minima. Some registration methods employ non-standard optimization methods that are designed specifically for the similarity function at hand, such as the ICP algorithm created for rigid model based registration.
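The saving offered by a multi-resolution scheme can be illustrated with a coarse-to-fine search for an integer shift between two 1D signals. This is a minimal sketch under assumptions chosen for clarity: circular shifts, a pyramid built by averaging neighbouring sample pairs, an exhaustive search at the coarsest level only, and a hypothetical two-bump test signal.

```python
import math

def shift(signal, k):
    """Circularly shift a 1D signal to the right by k samples."""
    k %= len(signal)
    cut = len(signal) - k
    return signal[cut:] + signal[:cut]

def downsample(signal):
    """Halve the resolution by averaging neighbouring pairs of samples."""
    return [(signal[i] + signal[i + 1]) / 2.0 for i in range(0, len(signal), 2)]

def ssd(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def coarse_to_fine_shift(source, target, levels=3):
    """Estimate the shift with a pyramid; return (shift, number of shifts evaluated)."""
    pyramid = [(source, target)]
    for _ in range(levels - 1):
        s, t = pyramid[-1]
        pyramid.append((downsample(s), downsample(t)))
    # Quasi-exhaustive search, but only on the small coarsest signals.
    s, t = pyramid[-1]
    best = min(range(len(s)), key=lambda k: ssd(shift(s, k), t))
    evaluations = len(s)
    # At each finer level, examine only the neighbours of the doubled coarse estimate.
    for s, t in reversed(pyramid[:-1]):
        candidates = [(2 * best + d) % len(s) for d in (-1, 0, 1)]
        best = min(candidates, key=lambda k: ssd(shift(s, k), t))
        evaluations += len(candidates)
    return best, evaluations

# Hypothetical signal with two bumps of different size, and a shifted copy.
source = [math.exp(-((i - 20) / 4.0) ** 2) + 0.5 * math.exp(-((i - 45) / 2.0) ** 2)
          for i in range(64)]
target = shift(source, 12)

estimate, evaluations = coarse_to_fine_shift(source, target)
print("estimated shift:", estimate)
print("shifts evaluated:", evaluations, "instead of", len(source))
```

The coarse level narrows the search to a small neighbourhood, and each finer level only refines it. The smoothing implied by downsampling also tends to flatten small local minima of the similarity function, although this toy example does not demonstrate that effect.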

Many applications use more than one optimization technique, frequently a fast but coarse technique followed by an accurate yet slow one.

Use and validation

After a registration has been obtained, two questions appear paramount: How accurate is the computed registration? and How can it be used? The latter question presents us with an entire area of research of its own: the answer may range from quite simple, e.g., only some statistical property of the subtracted registered images is required, to highly complex, e.g., a hybrid transparent stereo rendering that needs to be projected onto an operating microscope ocular is asked for. Such complex uses invariably require non-trivial visualizations in which segmentation must figure.

This creates a paradox: on the one hand, many registration applications show how intertwined the problems of registration and segmentation can be, and hence the designer of the registration algorithm is tempted to draw on his own expertise in answering the question on how the registration is to be used; indeed, this question must have figured in the registration algorithm design, which should have started out with a clinical need for registration. On the other hand, once a registration is obtained, the problem of How to use it? poses interdisciplinary problems of a previously unencountered nature. In other words: the areas of registration and visualization are still widely apart; not many registrations use state-of-the-art visualization, nor do many visualizations use registered input. Such solitary stances can be observed concerning other research areas too: registration and segmentation have many a common interest, yet are seldom integrated. This can be attributed to the fact that registration research is a relatively young area where many applications are concerned, to the fact that registration often involves new visualizations that possibly come with a steep interpretation learning-curve, to the fact that registration accuracy is often very hard to quantify sufficiently, to the logistic problems involved in integrating digital (or even analog) data from different machines often departments apart, to the extra equipment and time needed, and to the interdisciplinary gap.

The point of this long-winded periphrastic soliloquy is that the question of how the registration can be used is for the most part still unanswered: even though the need for registration is born out of a clinical need, the track after obtaining the transformation parameters is still largely blank.

Example


Monomodal registration of two Summer School (2005) attendees


Monomodal registration: color overlay


target              source           global rigid        local affine     flexible


Monomodal registration: local affine


Monomodal registration: deformable

Adapted from: J.B. Antoine Maintz & M.A. Viergever, "A Survey of Medical Image Registration", Medical Image Analysis, volume 2, number 1, pp. 1-37, 1998.