You are viewing an old version of this page. View the current version.

Compare with Current View Page History

« Previous Version 2 Next »

Motivation

Running on the edge side, it is common to encounter network connection failure. For rules which sink to external system, especially remote external system, it is important to cache the data during failures such as network disconnection and resend once reconnected.

Previously, eKuiper has some degree of support for sink caching. It provides a global configuration to switch cache; system/rule level configuration for memory cache serialization interval. However, the cache is only in memory and a copy to the db(mirror of the memory) and do not define clear resend strategy. It is actually part of the failure recovery mechanism (qos). In this proposal, the cache will be saved in both memory and disk so that the cache capacity becomes much bigger; it will also continuously detect failure recovery status and resend without restarting the rule.

Flow

The cache happens only in sink because that is the only place to send data outside eKuiper. Each sink can have its own cache mechanism configured. The flows in each sink is alike.

  1. Error detection: after sending failure, the sink should identify recoverable failure (network and so on) by returning a specific error type which will trigger the caching.
  2. Cache mechanism: memory cache store the earliest failed data.After exceeding the memory threshold, new cache store to disk. After exceeding the disk threshold, the memory cache as the earliest events will be dropped and read in the earliest page of cache in the disk.
  3. Resend strategy: when new data comes in (discuss how to trigger?), send the first data from the memory cache to detect network conditions. If successful, send all caches in order (mem + disk) with a defined interval.

Configuration

There wil be two levels of sink cache configuration. A global configuration in etc/kuiper.yaml to define the default behavior for all rules. And a rule sink level definition to override the default behaviors.

The configuration items:

  • enableCache: whether to enable sink cache.
  • memoryCacheSize: the number of messages to be cached in one page. The maximum cache size in memory is two pages. The memory will keep two section of pages: the first page of messages to be resent and the last messages which cannot fill a page yet. Once the page is full, it will be dumped to the disk.
  • diskCacheSize: the maximum number of messages to be cached in the disk. The disk cache is FIFO. If the disk cache is full, the earliest page of messages will be dropped.
  • resendInterval: the interval for resending the messages after failure recovered.

Implementation consideration

  • The disk storage will be sqlite. Each sink will have a sqlite table to save the cache.
  • Add cached count to the sink metrics



  • No labels