Sunday, December 1, 2019

We were discussing Flink APIs.
The source for events may generate and maintain state for the events it produces, and the generator can even restore that state from snapshots. Restoring state simply means using the last known timestamp from the snapshot: all events after that timestamp are then reprocessed. Each event carries a timestamp that can be extracted with Flink’s AscendingTimestampExtractor, and the snapshot is merely a timestamp. This allows all events to be processed from the last snapshot.
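As a sketch of this timestamp-based snapshotting, the following Python models a source whose entire state is the last emitted timestamp. The class and method names are illustrative, not Flink's API:

```python
class EventSource:
    """Generates events and tracks the timestamp of the last emitted event."""

    def __init__(self, events):
        # events: list of (timestamp, payload), assumed in ascending order
        self.events = events
        self.last_ts = 0  # state: timestamp of the last emitted event

    def snapshot(self):
        # The snapshot is merely the last known timestamp.
        return self.last_ts

    def restore(self, ts):
        # Restoring state means resuming from the saved timestamp.
        self.last_ts = ts

    def emit(self):
        # Yield only events strictly after the restored timestamp.
        for ts, payload in self.events:
            if ts > self.last_ts:
                self.last_ts = ts
                yield ts, payload
```

After a restore, the generator replays only the events subsequent to the snapshot timestamp, which is the behavior described above.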
A source may implement additional behavior, such as capping the maximum number of events per second. The event count is sampled periodically, and when it exceeds the cap within a period, the producer sleeps for the time remaining between the current time and the expiry of the period.
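The throttling described above can be sketched as follows; the class name and the one-second default period are assumptions for illustration:

```python
import time


class ThrottledProducer:
    """Caps the number of events per period; once the cap is reached,
    sleeps for the remainder of the period before emitting again."""

    def __init__(self, max_events, period=1.0):
        self.max_events = max_events
        self.period = period
        self.count = 0
        self.period_start = time.monotonic()

    def before_emit(self):
        now = time.monotonic()
        if now - self.period_start >= self.period:
            # A new period has begun: reset the counter.
            self.period_start = now
            self.count = 0
        if self.count >= self.max_events:
            # Cap exceeded: sleep for the time remaining until expiry.
            remaining = self.period - (now - self.period_start)
            if remaining > 0:
                time.sleep(remaining)
            self.period_start = time.monotonic()
            self.count = 0
        self.count += 1
```

The producer would call `before_emit()` ahead of each event; the call returns immediately while under the cap and blocks out the rest of the period otherwise.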
It should be noted that writing streams via connectors is facilitated by the store. However, this is not the only way to send data to a store. For example, well-known protocols such as the S3 API are widely recognized and apply to stream stores just as much as they do to object stores.
By the same argument, data transfer can also occur over proprietary REST-based APIs, not just the industry-standard S3 API. Simple HTTP requests that post data to the store are another way for applications to send data. This is also how popular technology stacks such as InfluxDB-Telegraf-Chronograf collect and transmit metrics data. Whether dedicated agents relay the data or the store itself accumulates it directly over the wire, both options widen the audience for the store.
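For example, a producer that is not a Flink job could post an event to the store with a plain HTTP request. The sketch below builds such a request with Python's standard library; the endpoint path, header names, and localhost URL are assumptions for illustration, not any particular store's documented API:

```python
import json
import urllib.request


def build_ingest_request(base_url, scope, stream, payload, token):
    """Build an HTTP POST carrying one JSON event to a stream endpoint."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/v1/scopes/{scope}/streams/{stream}/events",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )
```

Passing the returned request to `urllib.request.urlopen` would send it; the same endpoint is equally reachable from curl or any HTTP-capable agent, which is what widens the audience.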
Making it easy for an audience that does not write code to send data benefits not only the store but also the people who support and maintain production-level data stores, because it gives them an easy way to do a dry run rather than go through development cycles. A larger customer base also increases the store's popularity.
Technically, it is not appropriate to encapsulate a Flink connector within the HTTP request handler for data ingestion at the store. This API is far more generic than the upstream software used to send the data, because the consumer of this REST API could be a user interface, a language-specific SDK, or shell scripts making curl requests. It is better for the REST API implementation to accept the raw message directly, along with the destination and the authorization.
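A store-side handler along these lines might look like the following sketch, which takes the raw message, the destination stream, and the caller's authorization directly, with no connector in the request path. All names here are hypothetical:

```python
def handle_ingest(headers, destination, raw_message, append_fn, valid_tokens):
    """Validate authorization and append the raw message to the destination.

    append_fn stands in for the store's own append operation; it is an
    assumption of this sketch, not a real API.
    """
    token = headers.get("Authorization", "").removeprefix("Bearer ").strip()
    if token not in valid_tokens:
        return 401, "unauthorized"
    if not destination:
        return 400, "missing destination stream"
    # Append the raw bytes as-is; no assumption about who produced them.
    append_fn(destination, raw_message)
    return 200, "accepted"
```

Because the handler only sees headers, a destination, and raw bytes, the same code path serves a UI, an SDK, or a curl request alike.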
Implementation is upcoming in the Pravega fork tree at https://github.com/ravibeta/pravega
