Default Behavior and Extensibility

All main components of EMF Compare have been designed for extensibility. Some can only be customized when you launch comparisons through your own actions; others can be customized globally for a given kind of model or metamodel. This section outlines the customization options of all 6 comparison phases. (Any dead link? Report them on the forum!)

Model Resolving

PENDING description of the phase, extensibility (use of the modelProviders extension point, custom ext point of compare)

Match

Before we can compute differences between two versions of the same Object, we must determine which Objects are actually the "same". For example, consider a first model containing a Package P1 which itself contains a Class C1, and a second model containing a Package P1 which contains a Class C1. It may seem obvious to a human reader that "P1" and "C1" are the same objects in both models. However, since their features might have changed between the two versions (for example, "C1" might now be abstract, or it could have been converted to an Interface), this "equality" is not that obvious for a computer.

The goal of the "Match" phase is to discover which of the objects from model 2 match with which objects of model 1. In other words, this is when we'll say that two objects are one and the same, and that any difference between the two sides of this couple is actually a difference that should be reported as such to the user.

By default, EMF Compare browses through the elements that are within the scope, and matches them through their identifier if they have one, or through a distance mechanism for all elements that have none. If the scope contains resources, EMF Compare will first match those two-by-two before browsing through all of their contained objects.

EMF Compare "finds" the identifier of a given object through a basic function that can be found in IdentifierEObjectMatcher.DefaultIDFunction. In short, if the object is a proxy, its identifier is its URI fragment. Otherwise, its functional ID (in Ecore, an attribute that serves as an identifier) takes precedence over its XMI ID (the identifier it was given in the XMI file). If the object is not a proxy and has neither a functional nor an XMI identifier, the default behavior simply passes that object over to the proximity algorithms so that it can be matched through its distance to other objects.
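
The precedence rules described above can be sketched in plain Java. This is only an illustration under stated assumptions: ElementInfo and DefaultIdSketch are hypothetical stand-ins invented for this example, not EMF or EMF Compare types, and the real DefaultIDFunction works on EObjects rather than on pre-extracted strings.

```java
/** Simplified stand-in for a model element; this is NOT the EMF EObject API. */
final class ElementInfo {
    final boolean proxy;        // true if the element is an unresolved proxy
    final String uriFragment;   // fragment of the proxy URI (only used for proxies)
    final String functionalId;  // value of the ID attribute, or null if there is none
    final String xmiId;         // identifier from the XMI file, or null if there is none

    ElementInfo(boolean proxy, String uriFragment, String functionalId, String xmiId) {
        this.proxy = proxy;
        this.uriFragment = uriFragment;
        this.functionalId = functionalId;
        this.xmiId = xmiId;
    }
}

final class DefaultIdSketch {
    private DefaultIdSketch() {
    }

    /** Returns the identifier to match on, or null to defer to proximity matching. */
    static String idOf(ElementInfo element) {
        if (element.proxy) {
            return element.uriFragment;
        }
        if (element.functionalId != null) {
            // the functional ID takes precedence over the XMI ID
            return element.functionalId;
        }
        return element.xmiId; // may be null: the element has no identifier at all
    }
}
```

A null result plays the same role as in the description above: the element has no usable identifier and would be handed over to the distance-based matching.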

PENDING: brief description of the proximity algorithm

This behavior can be customized in a number of ways.

Overriding the Match engine

The most powerful (albeit most cumbersome) customization you can implement is to override the match engine EMF Compare uses. To this end you can either [http://git.eclipse.org/c/emfcompare/org.eclipse.emf.compare.git/tree/plugins/org.eclipse.emf.compare/src/org/eclipse/emf/compare/match/IMatchEngine.java implement the whole contract, IMatchEngine], in which case you will have to carefully follow the javadoc's recommendations, or extend the [http://git.eclipse.org/c/emfcompare/org.eclipse.emf.compare.git/tree/plugins/org.eclipse.emf.compare/src/org/eclipse/emf/compare/match/DefaultMatchEngine.java default implementation, DefaultMatchEngine].

A custom match engine can be used for your model comparison needs:

// for standalone usage
IMatchEngine.Factory.Registry registry = MatchEngineFactoryRegistryImpl.createStandaloneInstance();
// for OSGi (IDE, RCP) usage
// IMatchEngine.Factory.Registry registry = EMFCompareRCPPlugin.getMatchEngineFactoryRegistry();
final IMatchEngine customMatchEngine = new MyMatchEngine(...);
IMatchEngine.Factory engineFactory = new MatchEngineFactoryImpl() {
  public IMatchEngine getMatchEngine() {
    return customMatchEngine;
  }
};
engineFactory.setRanking(20); // default engine ranking is 10, must be higher to override.
registry.add(engineFactory);
EMFCompare.builder().setMatchEngineFactoryRegistry(registry).build().compare(scope);
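
The role of the ranking can be illustrated with a small standalone sketch. It assumes that the registry simply hands out the factory with the highest ranking; RankedFactory and RankingRegistrySketch are hypothetical names made up for this example, and any additional check the real registry may perform (for instance whether a factory applies to the compared scope) is not modeled here.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a match engine factory: only a name and a ranking. */
final class RankedFactory {
    final String name;
    final int ranking;

    RankedFactory(String name, int ranking) {
        this.name = name;
        this.ranking = ranking;
    }
}

/** Hypothetical registry that returns the factory with the highest ranking. */
final class RankingRegistrySketch {
    private final List<RankedFactory> factories = new ArrayList<>();

    void add(RankedFactory factory) {
        factories.add(factory);
    }

    /** Returns the highest-ranking factory, or null if none is registered. */
    RankedFactory highestRanking() {
        RankedFactory best = null;
        for (RankedFactory candidate : factories) {
            // strict comparison: on equal rankings, the first registered factory wins
            if (best == null || candidate.ranking > best.ranking) {
                best = candidate;
            }
        }
        return best;
    }
}
```

With a default factory ranked 10 and a custom one ranked 20, the custom factory is selected; with a custom ranking of 10 or lower, the default one would still be returned in this sketch.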

Changing how resources are matched

By default, the logic EMF Compare uses to match resources together is very simple: if two resources have the same name (strict equality on the name, without considering folders), they match. When this is not sufficient, EMF Compare will look at the XMI ID of the resources' root(s). If the two resources share at least one root with an equal XMI ID, they match.
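
As a rough illustration of these two rules, here is a standalone Java sketch. ResourceInfo and ResourceMatchSketch are hypothetical types written for this example (they are not part of EMF Compare), resources are reduced to a path and the set of XMI IDs of their roots, and "/" is assumed to be the path separator.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical stand-in for a resource: its path and the XMI IDs of its roots. */
final class ResourceInfo {
    final String path;
    final Set<String> rootXmiIds;

    ResourceInfo(String path, String... rootXmiIds) {
        this.path = path;
        this.rootXmiIds = new HashSet<>(Arrays.asList(rootXmiIds));
    }
}

final class ResourceMatchSketch {
    private ResourceMatchSketch() {
    }

    /** Strips the folders from a path, keeping only the last segment. */
    static String simpleName(String path) {
        return path.substring(path.lastIndexOf('/') + 1);
    }

    /** Two resources match on equal names, or else on a shared root XMI ID. */
    static boolean matches(ResourceInfo first, ResourceInfo second) {
        if (simpleName(first.path).equals(simpleName(second.path))) {
            return true;
        }
        // fall back to the roots: one common XMI ID is enough
        return !Collections.disjoint(first.rootXmiIds, second.rootXmiIds);
    }
}
```

In this sketch "left/model.uml" and "right/model.uml" match on their name alone, while "one.uml" and "two.uml" only match if their roots share an XMI ID.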

This can be changed only by implementing your own subclass of the DefaultMatchEngine and overriding its resource matcher. The method of interest here is DefaultMatchEngine#createResourceMatcher().

Defining custom identifiers

In some cases, there might be ways to identify your objects via the use of "identifiers" that cannot be identified as such by the default mechanism. For example, you might want each of your objects to be matched through their name alone, or through the composition of their name and their type... This can be achieved through code by simply redefining the function EMF Compare uses to find the ID of an object. The following code will tell EMF Compare that the identifier of all "MyEObject" elements is their name, and that any other element should go through the default behavior.

Function<EObject, String> idFunction = new Function<EObject, String>() {
	public String apply(EObject input) {
		if (input instanceof MyEObject) {
			return ((MyEObject)input).getName();
		}
		// a null return here tells the match engine to fall back to the other matchers
		return null;
	}
};
// Using this matcher as fall back, EMF Compare will still search for XMI IDs on EObjects
// for which we had no custom id function.
IEObjectMatcher fallBackMatcher = DefaultMatchEngine.createDefaultEObjectMatcher(UseIdentifiers.WHEN_AVAILABLE);
IEObjectMatcher customIDMatcher = new IdentifierEObjectMatcher(fallBackMatcher, idFunction);
 
IComparisonFactory comparisonFactory = new DefaultComparisonFactory(new DefaultEqualityHelperFactory());
 
final IMatchEngine matchEngine = new DefaultMatchEngine(customIDMatcher, comparisonFactory);
// Wrap the engine in a factory so that it can be registered
IMatchEngine.Factory engineFactory = new MatchEngineFactoryImpl() {
	@Override
	public IMatchEngine getMatchEngine() {
		return matchEngine;
	}
};
engineFactory.setRanking(20); // default engine ranking is 10, must be higher to override.
// for standalone usage
IMatchEngine.Factory.Registry registry = MatchEngineFactoryRegistryImpl.createStandaloneInstance();
// for OSGi (IDE, RCP) usage
// IMatchEngine.Factory.Registry registry = EMFCompareRCPPlugin.getMatchEngineFactoryRegistry();
registry.add(engineFactory);
EMFCompare.builder().setMatchEngineFactoryRegistry(registry).build().compare(scope);

Ignoring identifiers

There are cases where you do not want the identifiers of your elements to be taken into account when matching the objects. This can be done either programmatically or from the user interface:

Through code

IEObjectMatcher matcher = DefaultMatchEngine.createDefaultEObjectMatcher(UseIdentifiers.NEVER);
IComparisonFactory comparisonFactory = new DefaultComparisonFactory(new DefaultEqualityHelperFactory());

IMatchEngine matchEngine = new DefaultMatchEngine(matcher, comparisonFactory);
EMFCompare.builder().setMatchEngine(matchEngine).build().compare(scope);

From the user interface

PENDING: preference page

Refine the default Match result

If you are happy with most of what the default behavior does, but would like to refine some of it, you can do so by post-processing the result of the match phase. The original models are only used when matching, and will never be queried again afterwards. All remaining phases are incremental refinements of the "Comparison" model that has been created by the matching phase.

As such, you can impact all of the differencing process through this. Within this post-processing implementation, you can:

Defining a custom post-processor requires you to implement IPostProcessor and to register your implementation with EMF Compare. The latter can be done either via an extension point, in which case it will be considered for all comparisons on models that match its enablement, or programmatically if you only want it active for your own actions:

Through code

The following registers a post-processor for all UML models. This post-processor will not be triggered if there are no UML models (matching the given namespace URI) within the compared scope.

IPostProcessor customPostProcessor = new CustomPostProcessor();
IPostProcessor.Descriptor descriptor = new BasicPostProcessorDescriptorImpl(customPostProcessor, Pattern.compile("http://www.eclipse.org/uml2/\\d\\.0\\.0/UML"), null);

IPostProcessor.Descriptor.Registry<String> registry = new PostProcessorDescriptorRegistryImpl<String>();
registry.put(CustomPostProcessor.class.getName(), descriptor);
Comparison comparison = EMFCompare.builder().setPostProcessorRegistry(registry).build().compare(scope);

Through extension point

This accomplishes the exact same task, but it registers the post-processor globally. Any comparison through EMF Compare on a scope that contains models matching the given namespace URI will trigger that post-processor.

<extension point="org.eclipse.emf.compare.rcp.postProcessor">
      <postProcessor class="my.package.CustomPostProcessor">
         <nsURI value="http://www.eclipse.org/uml2/\d\.0\.0/UML">
         </nsURI>
      </postProcessor>
</extension>

Diff

Now that the Matching phase has completed and we know how our objects are coupled together, EMF Compare no longer requires the two (or three) input models. It will no longer iterate over them or over the comparison's input scope. From this point onward, only the result of our comparison, the Comparison object, will be refined through the successive remaining phases, starting with the Diff.

The goal of this phase is to iterate over all of our Match elements, be they unmatched (only one side has this object), couples (two of the three sides contain this object) or trios (all three sides have this object) and compute any difference that may appear between the sides. For example, an object that is only on one side of the comparison is an object that has been added, or deleted. But a couple might also represent a deletion: during three way comparisons, if we have an object in the common ancestor (origin) and in the left side, but not in the right side, then it has been deleted from the right version. However, this latter example might also be a conflict: we have determined that the object has been removed from the right side... but there might also be differences between the original version and the "left" version.

The differencing phase does not care about conflicts though: all it does is refine the comparison to tell that this particular Match has n+1 diffs: one DELETE difference on the right side, and n differences on the left. Detecting conflicts between these differences will come at a later time, during the conflict detection phase.
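
The way additions and deletions follow from the presence of an object on each side can be sketched as follows. This is a simplified standalone model, not the DefaultDiffEngine implementation: PresenceDiffSketch is a hypothetical class written for this example, it only looks at whether the object exists on each side of a three-way comparison, and it reports each difference as a plain string.

```java
import java.util.ArrayList;
import java.util.List;

final class PresenceDiffSketch {
    private PresenceDiffSketch() {
    }

    /**
     * Lists the differences implied by the presence of one matched object in
     * the origin, left and right versions. Each side is compared to the origin.
     */
    static List<String> presenceDiffs(boolean inOrigin, boolean inLeft, boolean inRight) {
        List<String> diffs = new ArrayList<>();
        compareWithOrigin(diffs, "LEFT", inOrigin, inLeft);
        compareWithOrigin(diffs, "RIGHT", inOrigin, inRight);
        return diffs;
    }

    private static void compareWithOrigin(List<String> diffs, String side,
            boolean inOrigin, boolean inSide) {
        if (inSide && !inOrigin) {
            diffs.add("ADD on " + side);
        } else if (!inSide && inOrigin) {
            diffs.add("DELETE on " + side);
        }
        // same presence on both: nothing to report for this side
    }
}
```

For the couple discussed above (object present in the origin and on the left, but not on the right), the sketch reports a single DELETE on the right side. Whether that deletion conflicts with changes made on the left is deliberately left out, as it is in the differencing phase itself.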

Customizations of this phase usually aim at ignoring specific differences.

Overriding the Diff engine

As is the case for the Match phase, the most powerful customization you can implement for the differencing process is to override the diff engine EMF Compare uses. To this end you can either [http://git.eclipse.org/c/emfcompare/org.eclipse.emf.compare.git/tree/plugins/org.eclipse.emf.compare/src/org/eclipse/emf/compare/diff/IDiffEngine.java implement the whole contract, IDiffEngine], in which case you will have to carefully follow the javadoc's recommendations, or extend the [http://git.eclipse.org/c/emfcompare/org.eclipse.emf.compare.git/tree/plugins/org.eclipse.emf.compare/src/org/eclipse/emf/compare/diff/DefaultDiffEngine.java default implementation, DefaultDiffEngine].

A custom diff engine can then be used for your comparisons:

IDiffEngine customDiffEngine = new MyDiffEngine(...);
EMFCompare.builder().setDiffEngine(customDiffEngine).build().compare(scope);

Changing the FeatureFilter

One of the differencing engine's responsibilities is to iterate over all features of a given object in order to check for potential differences on its value(s). However, there are some features that we decide to ignore by default, such as derived or transient features. Conversely, there may be features on which you would like to check for ordering changes even though they are marked as non-ordered.

The logic to determine whether a feature should be checked for differences has been extracted into its own class, and is quite easy to alter. For example, if you would like to ignore the name feature of your elements or never detect any ordering change:

IDiffProcessor diffProcessor = new DiffBuilder();
IDiffEngine diffEngine = new DefaultDiffEngine(diffProcessor) {
	@Override
	protected FeatureFilter createFeatureFilter() {
		return new FeatureFilter() {
			@Override
			protected boolean isIgnoredAttribute(EAttribute attribute) {
				// "name" is an EAttribute, so it is filtered here rather than in isIgnoredReference
				return attribute == EcorePackage.Literals.ENAMED_ELEMENT__NAME ||
						super.isIgnoredAttribute(attribute);
			}

			@Override
			public boolean checkForOrderingChanges(EStructuralFeature feature) {
				return false;
			}
		};
	}
};
EMFCompare.builder().setDiffEngine(diffEngine).build().compare(scope);

You could also change the diff processor to achieve a similar goal. The difference between the two approaches is that changing the FeatureFilter makes the engine ignore the structural feature altogether, whereas replacing the diff processor lets EMF Compare check the feature and detect the diff, but ignore the notification that there is a change.
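
The distinction between the two approaches can be made concrete with a small standalone model. It is only a sketch under simplifying assumptions: FilterVsProcessorSketch is a hypothetical class written for this example, objects are reduced to maps of feature names to values, and the filter and the processor are reduced to predicates on the feature name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.function.Predicate;

/** Hypothetical miniature diff engine used to contrast the two approaches. */
final class FilterVsProcessorSketch {
    int comparisons;                                  // features whose values were compared
    final List<String> reported = new ArrayList<>();  // features recorded as changed

    void diff(Map<String, Object> left, Map<String, Object> right,
            Predicate<String> ignoredByFilter, Predicate<String> droppedByProcessor) {
        for (String feature : left.keySet()) {
            if (ignoredByFilter.test(feature)) {
                continue; // filtered out: the values are never even compared
            }
            comparisons++;
            boolean changed = !Objects.equals(left.get(feature), right.get(feature));
            if (changed && !droppedByProcessor.test(feature)) {
                reported.add(feature); // the processor accepted the notification
            }
        }
    }
}
```

Ignoring "name" through the filter leaves the comparison counter at zero for that feature, whereas ignoring it through the processor still pays for the comparison and only discards the result. Both end up with no reported difference on "name".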

Changing the Diff Processor

The diff engine browses over all of the objects that have been matched, and inspects all of their features in order to detect changes between the feature values of the two (or three) versions. When it detects a change, it delegates all of the corresponding information to its associated Diff Processor, which is in charge of actually creating the Diff object and appending it to the resulting Comparison.

Replacing the Diff Processor gives you a simple entry point to ignore some of the differences the default engine detects, or to slightly alter the Diff information. You might want to ignore the differences detected on some references, for example. Or you might want to react to the detected diff without actually creating the Comparison model... The implementation is up to you. You can either reimplement the whole contract or extend the default implementation, [http://git.eclipse.org/c/emfcompare/org.eclipse.emf.compare.git/tree/plugins/org.eclipse.emf.compare/src/org/eclipse/emf/compare/diff/DiffBuilder.java DiffBuilder].

Here is a simple example that provides EMF Compare with a diff processor that will ignore all differences detected on the "name" attribute of our objects, yet keep the default behavior for all other differences.

IDiffProcessor customDiffProcessor = new DiffBuilder() {
	@Override
	public void attributeChange(Match match, EAttribute attribute, Object value, DifferenceKind kind, DifferenceSource source) {
		if (attribute != EcorePackage.Literals.ENAMED_ELEMENT__NAME) {
			super.attributeChange(match, attribute, value, kind, source);
		}
	}
};

IDiffEngine diffEngine = new DefaultDiffEngine(customDiffProcessor);
EMFCompare.builder().setDiffEngine(diffEngine).build().compare(scope);

Refine the default Diff result

The last possibility offered by EMF Compare to alter the result of the differencing phase is to post-process it. The remaining comparison phases (equivalence detection, detection of dependencies between diffs, and conflict detection) all use the result of the Diff engine and refine it even further. As such, all of these phases can be impacted by refining the Diff result.

Example uses of the post-processing would include:

The post-processor for the diff engine is implemented in exactly the same way as the match engine post-processors (the interface and extension point are the same). Please refer to "Refine the default Match result".

Equivalences

Now that the Differencing phase has ended, we have computed all of the individual differences within the compared models. However, all of these differences are still isolated; we now need to determine whether there are any connections between them.

An equivalence is one kind of potential connection between differences. For example, Ecore has a concept of eOpposite references, which must be maintained in sync with one another. Modifying one of the two references will automatically update the other side of the opposition accordingly. Both the manual modification and the automatic update are considered distinct modifications of the model when looking at it after the fact, resulting in two detected differences. However, merging either of these two differences will automatically merge the other one. Therefore, both are marked as being equivalent to each other.

Though that is an example with two, more than two differences can be considered equivalent to each other. When we merge one difference, all of the other diffs that have been marked as equivalent to it will be marked as MERGED, though no actual work needs to be done to merge them: EMF will have automatically updated them when merging the first.

Do note that EMF Compare detects and understands two different kinds of relations that could be considered "equivalences". Described above are plain equivalences, where merging one of the differences automatically updates the model in such a way that all other sides of the equivalence are redundant and automatically merged. However, equivalences might not always be that easy. Take UML for example: UML has concepts of subset and superset references. Adding an object to a subset will automatically update the related superset so that it also contains the added element. However, adding that same element to the superset will not automatically update the related subset.

This can be seen as a "directional" equivalence, where one difference D1 implies another difference D2, but is not implied by D2. Implications are detected at the same time as the equivalences, but they do not use an Equivalence element; they are stored under the Diff#implies reference instead.
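
The difference between a plain equivalence and an implication can be sketched with a toy model. The classes below (SketchDiff, EquivalenceSketch) are hypothetical and written only for this example; they are not the EMF Compare Diff or Equivalence types, and marking a diff as merged is reduced to setting a boolean.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a difference, with its two kinds of connections. */
final class SketchDiff {
    final String name;
    boolean merged;
    final List<SketchDiff> equivalentTo = new ArrayList<>(); // symmetric relation
    final List<SketchDiff> implies = new ArrayList<>();      // one-way relation

    SketchDiff(String name) {
        this.name = name;
    }
}

final class EquivalenceSketch {
    private EquivalenceSketch() {
    }

    /** Equivalences are recorded on both sides. */
    static void markEquivalent(SketchDiff first, SketchDiff second) {
        first.equivalentTo.add(second);
        second.equivalentTo.add(first);
    }

    /** Merging a diff also marks its equivalent and implied diffs as merged. */
    static void merge(SketchDiff diff) {
        if (diff.merged) {
            return; // already handled, which also stops cycles between equivalents
        }
        diff.merged = true;
        for (SketchDiff equivalent : diff.equivalentTo) {
            merge(equivalent);
        }
        for (SketchDiff implied : diff.implies) {
            merge(implied);
        }
    }
}
```

With a "subset" diff that implies a "superset" diff, merging the subset diff marks both as merged, while merging the superset diff alone leaves the subset diff untouched. Two diffs linked by markEquivalent are always merged together, whichever one is merged first.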

Refine the default equivalences

This phase does not offer as many customization options as the previous ones, though post-processing should be enough for most needs. All of the phases that come after the differencing are further refinements of the comparison model, totally independent from one another. From here, any client of the API can refine the comparison model in any way they like, even by removing all of the default results.

A few examples of customizations that could be made from here:

The post-processor for the equivalence detection engine is implemented in exactly the same way as the match engine post-processors (the interface and extension point are the same). Please refer to "Refine the default Match result".

Requirements

A requirement is used to deal with structural constraints (to prevent dangling references or objects without a parent) and to ensure model integrity. A difference requires other differences if merging it without them would "break" the model. Merging a difference therefore involves merging the differences it requires, and EMF Compare will merge all of them. For example, the addition of a reference requires the addition of the referenced object and the addition of the object containing this reference.

Change kind | Reference kind to a graphical object | Requires
ADD         | content                              | ADD of its container
            |                                      | DELETE of the origin value on the same containment mono-valued reference
ADD         | reference                            | ADD of the target object
            |                                      | ADD of the source object
DELETE      | content                              | DELETE of the outgoing references and contained objects
            |                                      | DELETE/CHANGE of the incoming references
            |                                      | MOVE of the contained objects
MOVE        | content                              | ADD of the new container of the object
            |                                      | MOVE of the new container of the object
CHANGE      | reference permutation                | ADD of the target object

E.g. the ADD of a reference to the target or source of an edge requires the ADD of the edge itself and the ADD of the target and source objects; the ADD of a reference to the semantic object from a graphical one requires the ADD of the graphical and semantic objects.
Requirements can be added during a post-process.
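
To illustrate how requirements drive the merge, here is a minimal standalone sketch that computes the order in which differences would be merged so that every requirement is merged before the difference that needs it. RequiringDiff and RequirementSketch are hypothetical names created for this example and do not reflect how the EMF Compare mergers are implemented.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for a difference and the differences it requires. */
final class RequiringDiff {
    final String name;
    final List<RequiringDiff> requires = new ArrayList<>();

    RequiringDiff(String name) {
        this.name = name;
    }
}

final class RequirementSketch {
    private RequirementSketch() {
    }

    /** Returns the names of the diffs to merge, requirements first. */
    static List<String> mergeOrder(RequiringDiff diff) {
        List<String> order = new ArrayList<>();
        visit(diff, new HashSet<RequiringDiff>(), order);
        return order;
    }

    private static void visit(RequiringDiff diff, Set<RequiringDiff> visited,
            List<String> order) {
        if (!visited.add(diff)) {
            return; // already scheduled, which also guards against cycles
        }
        for (RequiringDiff required : diff.requires) {
            visit(required, visited, order);
        }
        order.add(diff.name); // only after everything it requires
    }
}
```

Taking the example above, if "ADD reference" requires "ADD target" and "ADD source", mergeOrder schedules the two object additions before the reference addition.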

Refinement

A refinement groups a set of unit differences into a macroscopic change.
A unit difference refines a macroscopic one if it belongs to this macroscopic change. In other words, a macroscopic change is refined by a set of unit differences.
Merging a macroscopic change involves merging the differences that refine it. All of these differences will be merged by EMF Compare.
Refinements make the comparison easier to read from a business viewpoint, speed up manual merging for the end-user, and help ensure some consistency.
For example, the addition of an association in UML is refined by the addition of the UML Association itself, but also by the addition of the UML properties and the update of references...

Refinements can be added during a post-process.

Conflicts

PENDING description of the phase, extensibility options (post-process)

Merging

Which references are followed during merging

               | Merge from Left to Right | Merge from Right to Left
Source = Left  | requires                 | requiredBy
Source = Right | requiredBy               | requires
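
The table above boils down to a single rule, sketched here in standalone Java: "requires" is followed when the difference is merged away from its own source side, and "requiredBy" otherwise. The enums and the MergeReferenceSketch class are hypothetical names written for this example; the result is returned as a plain string rather than as an EMF Compare reference.

```java
/** Side on which a difference was detected (hypothetical stand-in). */
enum DiffSide { LEFT, RIGHT }

/** Direction of the merge (hypothetical stand-in). */
enum MergeDirection { LEFT_TO_RIGHT, RIGHT_TO_LEFT }

final class MergeReferenceSketch {
    private MergeReferenceSketch() {
    }

    /** Returns the name of the reference followed when merging, per the table. */
    static String followedReference(DiffSide source, MergeDirection direction) {
        // true when the diff is propagated from its own side to the other one
        boolean fromOwnSide = (source == DiffSide.LEFT)
                == (direction == MergeDirection.LEFT_TO_RIGHT);
        return fromOwnSide ? "requires" : "requiredBy";
    }
}
```

For instance, a difference detected on the left and merged from right to left yields "requiredBy", which matches the second column of the first row.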

PENDING how to provide custom mergers, override existing ones?

User Interface

PENDING customize display of custom differences, add custom menu entries, add groups, add filters, add export options, provide custom content viewer