Class Archiver<E>

java.lang.Object
org.jivesoftware.openfire.archive.Archiver<E>
All Implemented Interfaces:
Runnable

public abstract class Archiver<E> extends Object implements Runnable
An abstract runnable that adds to-be-archived data to the database. This implementation is designed to reduce the work load on the database, by batching work where possible, without severely delaying database writes. This implementation acts as a consumer (in context of the producer-consumer design pattern), where the queue that is used to relay work from both processes is passed as an argument to the constructor of this class. This mechanism should be used with care in a clustered setup: although the individual Archiver instances will guarantee that the database insertion order matches the order in which elements are provided to the instance (using the archive(Object) method), this is not the case in a clustered environment. As each cluster node will manage its own batched queue of data, the insertion order of the elements in persistent storage might no longer reflect the order in which the data was made available somewhere in the cluster. If data order is important, the data to be archived should have a value that can be used to order on, which must already be present when the object is provided to the archive(Object) method (as opposed to delaying setting this value to when the data is being persisted in the backend storage).
Author:
Guus der Kinderen, guus.der.kinderen@gmail.com
  • Constructor Details

    • Archiver

      protected Archiver(String id, int maxWorkQueueSize, Duration maxPurgeInterval, Duration gracePeriod)
      Instantiates a new archiver.
      Parameters:
      id - A unique identifier for this archiver.
      maxWorkQueueSize - Do not add more than this amount of queries in a batch.
      maxPurgeInterval - Do not delay longer than this amount before storing data in the database.
      gracePeriod - Maximum amount of milliseconds to wait for 'more' work to arrive, before committing the batch.
  • Method Details

    • archive

      public void archive(E data)
    • getId

      public String getId()
    • run

      public void run()
      Specified by:
      run in interface Runnable
    • stop

      public void stop()
    • availabilityETA

      public Duration availabilityETA(Instant instant)
      Returns an estimation on how long it takes for all data that arrived before a certain instant will have become available in the data store. When data is immediately available, 'zero', is returned; Beware: implementations are free to apply a low-effort mechanism to determine a non-zero estimate. Notably, an implementation can choose to not obtain ETAs from individual cluster nodes, when the local cluster node is reporting a non-zero ETA. However, a return value of 'zero' must be true for the entire cluster (and is, in effect, not an 'estimate', but a matter of fact. This method is intended to be used to determine if it's safe to construct an answer (based on database content) to a request for archived data. Such response should only be generated after all data that was queued before the request arrived has been written to the database. This method performs a cluster-wide check, unlike availabilityETAOnLocalNode(Instant).
      Parameters:
      instant - The timestamp of the data that should be available (cannot be null).
      Returns:
      A period of time, zero when the requested data is already available.
    • availabilityETAOnLocalNode

      public Duration availabilityETAOnLocalNode(Instant instant)
      Returns an estimation on how long it takes for all data that arrived before a certain instant will have become available in the data store. When data is immediately available, 'zero', is returned; This method is intended to be used to determine if it's safe to construct an answer (based on database content) to a request for archived data. Such response should only be generated after all data that was queued before the request arrived has been written to the database. This method performs a check on the local cluster-node only, unlike availabilityETA(Instant).
      Parameters:
      instant - The timestamp of the data that should be available (cannot be null).
      Returns:
      A period of time, zero when the requested data is already available.
    • getMaxWorkQueueSize

      public int getMaxWorkQueueSize()
    • setMaxWorkQueueSize

      public void setMaxWorkQueueSize(int maxWorkQueueSize)
    • getMaxPurgeInterval

      public Duration getMaxPurgeInterval()
    • setMaxPurgeInterval

      public void setMaxPurgeInterval(Duration maxPurgeInterval)
    • getGracePeriod

      public Duration getGracePeriod()
    • setGracePeriod

      public void setGracePeriod(Duration gracePeriod)
    • store

      protected abstract void store(List<E> batch)