Thursday, January 9, 2020

The ProcessFunction is invoked for every event, but the user decides what the collector emits. It also gives access to timers.
There are four aspects of these timers:
1) Timers work only with KeyedStreams.
2) Timers are de-duplicated per key and timestamp.
3) Timers are checkpointed.
4) Timers can be deleted.

There is at most one timer per key and timestamp. If multiple timers are registered for the same key and timestamp, only one fires, so the timers for repeated events are de-duplicated automatically. Timers are also persisted with checkpoints, so the process function is fault-tolerant. Timers can also be deregistered.
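
The de-duplication and deregistration rules above can be illustrated with a toy model in plain Java. This is only a sketch of the bookkeeping, not Flink's implementation: the class TimerModel and its methods register, delete and advanceTo are hypothetical names made up for this illustration. In a real job the corresponding calls are made through the timer service available from the context of a keyed process function.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// Toy model of per-key timer bookkeeping. Hypothetical class, NOT Flink code.
class TimerModel {
    // key -> sorted set of timestamps; a set holds one entry per (key, timestamp)
    private final Map<String, TreeSet<Long>> timers = new TreeMap<>();

    // Registering the same (key, timestamp) a second time changes nothing.
    void register(String key, long timestamp) {
        timers.computeIfAbsent(key, k -> new TreeSet<>()).add(timestamp);
    }

    // Deregistering removes the pending timer, if there is one.
    void delete(String key, long timestamp) {
        TreeSet<Long> pending = timers.get(key);
        if (pending != null) {
            pending.remove(timestamp);
        }
    }

    // Fires and removes every timer whose timestamp is <= the given time.
    List<String> advanceTo(long time) {
        List<String> fired = new ArrayList<>();
        for (Map.Entry<String, TreeSet<Long>> entry : timers.entrySet()) {
            TreeSet<Long> due = new TreeSet<>(entry.getValue().headSet(time, true));
            for (long ts : due) {
                fired.add(entry.getKey() + "@" + ts);
            }
            entry.getValue().removeAll(due);
        }
        return fired;
    }
}
```

Registering ("a", 10) twice, registering ("b", 20) and ("b", 30), deleting ("b", 20) and then advancing to time 30 fires exactly two timers in this model: a@10 once and b@30.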

The ReduceFunction can be used to find the sum: 
keyedStream.reduce(new ReduceFunction<Integer>() { 
    @Override 
    public Integer reduce(Integer val1, Integer val2) 
    throws Exception { 
        return val1 + val2; 
    } 
}); 
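
A reduce on a keyed stream is a rolling reduce: each incoming element is combined with the last reduced value for its key, and the new value is emitted. The following plain-Java sketch models that behaviour under simplifying assumptions. The class RollingReduce and its run method are hypothetical names for illustration, and the per-key state is just an in-memory map.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;

// Toy model of a keyed rolling reduce. Hypothetical class, NOT Flink code.
class RollingReduce {
    // Feeds (key, value) pairs through the reducer and returns the value emitted after each input.
    static List<Integer> run(String[] keys, int[] values, BinaryOperator<Integer> reducer) {
        Map<String, Integer> state = new HashMap<>(); // last reduced value per key
        List<Integer> emitted = new ArrayList<>();
        for (int i = 0; i < keys.length; i++) {
            // The first value for a key is stored as-is; later values are combined with the stored one.
            Integer current = state.merge(keys[i], values[i], reducer);
            emitted.add(current);
        }
        return emitted;
    }
}
```

With keys a, b, a, a and values 1, 10, 2, 3, calling run with Integer::sum as the reducer emits 1, 10, 3, 6: a running sum per key rather than a single total at the end.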

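A ProcessFunction can be used for counting as well. It emits 1 for every event, and those values can then be summed downstream. ProcessFunction is an abstract class, so it is written as an anonymous class rather than a lambda:
eventsRead.process(new ProcessFunction<String, Integer>() { // element type String is assumed here
    @Override
    public void processElement(String s, Context context, Collector<Integer> collector) {
        logger.info("processElement:s={}", s); // logger is assumed to be an SLF4J Logger
        collector.collect(1);
    }
});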

When the job execution is in detached mode, the results may not be immediately available. If the results are needed, it is better to write them to another stream. This stream can then be read by other means that don’t need a Flink job execution.

The JobExecutionResult is returned by env.execute(). It includes the results of accumulators and counters, which can be read by name with getAccumulatorResult().
For example, an accumulator such as an IntCounter or a Histogram can be instantiated inside a RichFlatMapFunction. This would include:
private IntCounter intCounter = new IntCounter();                     // field of the function
getRuntimeContext().addAccumulator("counterName", this.intCounter);  // in open()
this.intCounter.add(1);                                               // in flatMap()
