Class Archiver<E>

  • All Implemented Interfaces:
    Runnable

    public abstract class Archiver<E>
    extends Object
    implements Runnable
    An abstract runnable that adds to-be-archived data to the database. This implementation is designed to reduce the work load on the database, by batching work where possible, without severely delaying database writes. This implementation acts as a consumer (in context of the producer-consumer design pattern), where the queue that is used to relay work from both processes is passed as an argument to the constructor of this class. This mechanism should be used with care in a clustered setup: although the individual Archiver instances will guarantee that the database insertion order matches the order in which elements are provided to the instance (using the archive(Object) method), this is not the case in a clustered environment. As each cluster node will manage its own batched queue of data, the insertion order of the elements in persistent storage might no longer reflect the order in which the data was made available somewhere in the cluster. If data order is important, the data to be archived should have a value that can be used to order on, which must already be present when the object is provided to the archive(Object) method (as opposed to delaying setting this value to when the data is being persisted in the backend storage).
    Author:
    Guus der Kinderen, guus.der.kinderen@gmail.com
    • Constructor Detail

      • Archiver

        protected Archiver​(String id,
                           int maxWorkQueueSize,
                           Duration maxPurgeInterval,
                           Duration gracePeriod)
        Instantiates a new archiver.
        Parameters:
        id - A unique identifier for this archiver.
        maxWorkQueueSize - Do not add more than this amount of queries in a batch.
        maxPurgeInterval - Do not delay longer than this amount before storing data in the database.
        gracePeriod - Maximum amount of milliseconds to wait for 'more' work to arrive, before committing the batch.
    • Method Detail

      • archive

        public void archive​(E data)
      • getId

        public String getId()
      • run

        public void run()
        Specified by:
        run in interface Runnable
      • stop

        public void stop()
      • availabilityETA

        public Duration availabilityETA​(Instant instant)
        Returns an estimation on how long it takes for all data that arrived before a certain instant will have become available in the data store. When data is immediately available, 'zero', is returned; Beware: implementations are free to apply a low-effort mechanism to determine a non-zero estimate. Notably, an implementation can choose to not obtain ETAs from individual cluster nodes, when the local cluster node is reporting a non-zero ETA. However, a return value of 'zero' must be true for the entire cluster (and is, in effect, not an 'estimate', but a matter of fact. This method is intended to be used to determine if it's safe to construct an answer (based on database content) to a request for archived data. Such response should only be generated after all data that was queued before the request arrived has been written to the database. This method performs a cluster-wide check, unlike availabilityETAOnLocalNode(Instant).
        Parameters:
        instant - The timestamp of the data that should be available (cannot be null).
        Returns:
        A period of time, zero when the requested data is already available.
      • availabilityETAOnLocalNode

        public Duration availabilityETAOnLocalNode​(Instant instant)
        Returns an estimation on how long it takes for all data that arrived before a certain instant will have become available in the data store. When data is immediately available, 'zero', is returned; This method is intended to be used to determine if it's safe to construct an answer (based on database content) to a request for archived data. Such response should only be generated after all data that was queued before the request arrived has been written to the database. This method performs a check on the local cluster-node only, unlike availabilityETA(Instant).
        Parameters:
        instant - The timestamp of the data that should be available (cannot be null).
        Returns:
        A period of time, zero when the requested data is already available.
      • getMaxWorkQueueSize

        public int getMaxWorkQueueSize()
      • setMaxWorkQueueSize

        public void setMaxWorkQueueSize​(int maxWorkQueueSize)
      • getMaxPurgeInterval

        public Duration getMaxPurgeInterval()
      • setMaxPurgeInterval

        public void setMaxPurgeInterval​(Duration maxPurgeInterval)
      • getGracePeriod

        public Duration getGracePeriod()
      • setGracePeriod

        public void setGracePeriod​(Duration gracePeriod)
      • store

        protected abstract void store​(List<E> batch)