

De-duplication

When an email is sent to multiple recipients, the recipient accounts are often hosted on different Exchange or Domino servers. If each of these servers hosts its own journal, multiple copies of the original email exist: one in the journal on each recipient mailbox server, plus one in the journal on the sender's mailbox server.

The de-duplication of emails

De-duplication email flow:

  1. Single user. A user in the New York office sends an email to the 'All Marketing' distribution list.
  2. Journal mailboxes (Exchange or Domino). There are members of the distribution list in the New York, Boston and Toronto offices, each of which runs its own email server with its own journal. A copy of the email now exists on each journal server.
  3. De-duplication database. The Universal Adapter processes these emails via the de-duplication database, which stores a unique ID for each email. The ID is identical for each duplicate email, enabling the de-duplication database to filter out emails already processed by the Universal Adapter. The New York copy is processed first, and the other two copies are identified as duplicates.
  4. Email Archive. The Universal Adapter then sends a single copy of the email to its outputs.

In the previous example, a user in the New York office sends an email to the All Marketing distribution list. There are members of the distribution list in the New York, Boston and Toronto offices, and each office runs its own Exchange or Domino server with its own journal. A copy of the email now exists on each of these journal servers.

The Universal Adapter processes these emails via the de-duplication database, which stores a unique ID for each email. The ID is identical for each duplicate email, which enables the de-duplication database to filter out emails that have already been processed by the Universal Adapter. In this example, the New York copy is processed first, and the Boston and Toronto copies are identified as duplicates. The Universal Adapter therefore sends only a single copy of the email to its outputs. Without this de-duplication, three identical emails would be stored in the email archive, wasting storage space.
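
The filtering step described above can be illustrated with a minimal Python sketch. This is not the Universal Adapter's implementation: the class names, the `unique_id` field, and the in-memory set standing in for the de-duplication database are all hypothetical, and the sketch takes the shared ID as a given string because the way the product derives it is not described here.

```python
from dataclasses import dataclass, field


@dataclass
class JournaledEmail:
    """One copy of an email as found in a journal (hypothetical model)."""
    unique_id: str   # assumed: identical for every copy of the same email
    journal: str     # which office's journal this copy came from
    subject: str


@dataclass
class DedupDatabase:
    """In-memory stand-in for the de-duplication database (assumption)."""
    seen_ids: set = field(default_factory=set)

    def is_duplicate(self, email: JournaledEmail) -> bool:
        # An ID that is already stored means this email was processed before.
        if email.unique_id in self.seen_ids:
            return True
        self.seen_ids.add(email.unique_id)
        return False


def process(emails, db):
    """Return only the emails that should be passed on to the outputs."""
    forwarded = []
    for email in emails:
        if not db.is_duplicate(email):
            forwarded.append(email)
    return forwarded


# The three journal copies from the example, processed in this order.
copies = [
    JournaledEmail("msg-001", "New York", "Campaign update"),
    JournaledEmail("msg-001", "Boston", "Campaign update"),
    JournaledEmail("msg-001", "Toronto", "Campaign update"),
]

db = DedupDatabase()
archived = process(copies, db)
print([e.journal for e in archived])
```

Because the New York copy is seen first, it is the only one forwarded; the Boston and Toronto copies match an ID already stored and are dropped.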