Friday, January 3, 2020


Flink Streaming jobs appear to run for a long time because the stream has no bounds. The program that invokes the env.execute() is kicked off in detached mode. The other mode of execution is blocking mode and it does not apply to StreamExecutionEnvironment but only to LocalExecutionEnvironment. The job itself will either appear with status as started on success or appear as error on failure. The logs for the job execution will only be partial because the foreground disappears after making an asynchronous call.

The logs for the background will show all the activities performed after the invocation.

There are also a few ways to gather some counts programmatically. These include:

eventsRead.addSink( new SinkFunction<String> () {

Private int count;

@Override

Public void invoke(String value) throws Exception {

             count++;

             logger.error(“count = {}, valueRead ={}”, count, value) ;

           }

}) ;



And the other is with using iterative streams

IterativeStream it = eventsRead.iterate();

It.withFeedbackType(String. Class) ;

DataStream mapped =it.map( t - > { logger. Info(t) ; return t;}) ;

It.closeWith(mapped);

When a job is performed in detached mode, the job execution result is not available immediately. That result is only available when the Flink application program is run in blocking mode which is usually kit the case for streaming mode.

There are ways to sleep between reads and writes but the scheduling of the job occurs when the execute is called.  This sometimes makes it harder for the program to be debugged via the application logs but the jobManager has up to date logs.

Flink Applications can use ProcessFunction with streams. It is similar to FlatMap Function but handles all three : events, state and timers. The function applies to each and every event in the input stream. It also gives access to the Flink keyed state via the runtime Context. The timer allows changes in event time and processing time to be handled by the application. Both the timestamps and timerservice are available via the context object. This is only possible with process function on a keyed stream.

No comments:

Post a Comment