Archiving JSON events from RabbitMQ to S3
I have a process publishing JSON events to a RabbitMQ exchange. I'd like to attach a consumer that both groups the events and archives them to S3 in a sensible, buffered, and compressed manner. Specifically, given an event like {"id": "1a2b3c4d5e6f", "ts": 1439995475, ... },
I'd like it to end up at an S3 key looking like %Y/%m/%d/%H/1a2b3c4d5e6f.json.gz,
where the datetime components of that key are derived by flooring the timestamp to some time interval, e.g.:
    new_ts = math.floor(event["ts"] / secs_per_hr) * secs_per_hr
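In full, the key derivation I have in mind would be something like this (plain Python; SECS_PER_HR is just the example interval):

    import math
    from datetime import datetime, timezone

    SECS_PER_HR = 3600

    def s3_key(event, interval=SECS_PER_HR):
        # Floor the timestamp to the interval, then format the datetime prefix.
        floored = math.floor(event["ts"] / interval) * interval
        dt = datetime.fromtimestamp(floored, tz=timezone.utc)
        return dt.strftime("%Y/%m/%d/%H/") + event["id"] + ".json.gz"

    # s3_key({"id": "1a2b3c4d5e6f", "ts": 1439995475})
    # -> "2015/08/19/14/1a2b3c4d5e6f.json.gz"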
This seems like a relatively common problem, and while I could, I'm disinclined to write my own solution. It's a pretty standard messaging pattern, and it seems like a lot of log management technologies should be able to handle this.
Specifically, I'm looking at a Logstash solution. It seemed like there would be a relatively simple pipeline (sketched below):
- a RabbitMQ input adapter
- a JSON filter that adds the ts and id fields as tags
- an S3 output adapter that buffers the events and rolls them off (https://www.elastic.co/guide/en/logstash/current/plugins-outputs-s3.html)
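As far as I can tell, the config would look something like the sketch below (host, queue, and bucket names are placeholders, and option names may vary between plugin versions):

    input {
      rabbitmq {
        host    => "localhost"
        queue   => "events-archive"
        durable => true
      }
    }

    filter {
      json {
        source => "message"   # parse the event body so ts and id become fields
      }
    }

    output {
      s3 {
        bucket    => "my-archive-bucket"
        prefix    => "events/"     # static prefix only; no per-event key templating
        size_file => 10485760      # roll files by size (bytes)...
        time_file => 60            # ...or by time
      }
    }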
Still, I'm confronted with several issues that I'm not savvy enough with Logstash to address:
- To my knowledge, there's no way for Logstash to perform either the timestamp math or the more abstract alternative of grouping by timestamp.
- The S3 plugin provides no utilities for more elaborate key name templating (which is relatively important because the data will be accessed programmatically).
- There doesn't seem to be any way of adding compression to the S3 plugin's output.
- Most problematic of all, there doesn't seem to be any logic for merging files -- understandably, the S3 plugin keeps life simple by making sure key names are unique, thereby avoiding merging.
Am I asking too much of Logstash here, or is there a way it can fill these purposes? Is there a third-party solution that addresses these needs (it seems like there should be!)? Or do I have to resort to my own devices here, along the lines of the sketch below?
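To be concrete about what "my own devices" would mean, here's a rough, untested sketch assuming pika and boto3 (queue, exchange, and bucket names are placeholders), minus the buffering/merging logic I'd rather not write myself:

    import gzip
    import json
    import math
    from datetime import datetime, timezone

    import boto3
    import pika

    SECS_PER_HR = 3600
    BUCKET = "my-archive-bucket"  # placeholder

    s3 = boto3.client("s3")

    def on_message(channel, method, properties, body):
        # Derive the hour-floored key (same derivation as above) and upload gzipped.
        event = json.loads(body)
        floored = math.floor(event["ts"] / SECS_PER_HR) * SECS_PER_HR
        prefix = datetime.fromtimestamp(floored, tz=timezone.utc).strftime("%Y/%m/%d/%H/")
        s3.put_object(
            Bucket=BUCKET,
            Key=prefix + event["id"] + ".json.gz",
            Body=gzip.compress(body),
            ContentType="application/json",
            ContentEncoding="gzip",
        )
        channel.basic_ack(delivery_tag=method.delivery_tag)

    connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
    channel = connection.channel()
    channel.queue_declare(queue="events-archive", durable=True)    # placeholder
    channel.queue_bind(queue="events-archive", exchange="events")  # placeholder
    channel.basic_consume(queue="events-archive", on_message_callback=on_message)
    channel.start_consuming()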