Friday, May 29, 2020

Application troubleshooting guide continued...

Stream store and analytics automates the deployment of applications written using Flink. This includes options for:

  • Authentication and authorization 

  • Stream store metrics 

  • High availability 

  • State and fault tolerance via state backends 

  • Configuration, including memory configuration, and

  • All of the production readiness checklist, which includes:

  • Setting an explicit max parallelism 

  • Setting UUIDs for all operators

  • Choosing the right state backend

  • Configuring high availability for job managers 

  • Debugging and monitoring: 

  • Flink applications support extensible user metrics in addition to system metrics. Although not all of these might be exposed via the stream store and analytics user interface, applications can register metric types such as Counters, Gauges, Histograms and Meters.

  • Log4j and Logback can be configured with the appropriate log levels to emit log entries for the appropriate operators. These logs can also be collected continuously as the system makes progress. 

  • The status and statistics of completed jobs that have been archived by the JobManager can be viewed via the HistoryServer once it has been configured. The Flink user interface may also support this.

  • Since the job graph involves multiple jobs, each of them can be queried independently using its job ID.

  • Checkpoints can also be monitored, although stream store and analytics might not support this via its user interface.

  • Checkpoints can be triggered and restorations can be performed. 

  • Backpressure can be detected. If a task produces data faster than its downstream operators can consume it, the task is given a backpressure rating.

  • REST APIs are available for monitoring. They come from the ‘flink-runtime’ project and are hosted by the Dispatcher. The web dashboard for monitoring also shows this information. It is also possible to extend these APIs.

  • Event time and watermarks are powerful features that enable applications to handle late events and out-of-order events so that the events remain sequenced. The Flink 
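
To make the idea of late and out-of-order events more concrete, here is a minimal sketch in plain Java. It is a simplified model and not Flink's own API: the class name WatermarkSketch, the 5-second tolerance and the sample timestamps are all made up for illustration. It assumes the watermark trails the highest event time seen so far by a fixed tolerance, and that an event whose timestamp is at or below the current watermark counts as late.

```java
/** Simplified model of a bounded out-of-orderness watermark; not Flink's API. */
final class WatermarkSketch {
    // Assumed tolerance: how far behind the newest event an event may arrive.
    private final long maxOutOfOrdernessMs;
    // Highest event time observed so far; Long.MIN_VALUE means "nothing seen yet".
    private long maxSeenTimestampMs = Long.MIN_VALUE;

    WatermarkSketch(long maxOutOfOrdernessMs) {
        this.maxOutOfOrdernessMs = maxOutOfOrdernessMs;
    }

    /** Watermark = highest event time seen minus the tolerance. */
    long currentWatermark() {
        return maxSeenTimestampMs == Long.MIN_VALUE
                ? Long.MIN_VALUE
                : maxSeenTimestampMs - maxOutOfOrdernessMs;
    }

    /** Records the event and returns true if it is on time, false if it is late. */
    boolean accept(long eventTimestampMs) {
        boolean late = eventTimestampMs <= currentWatermark();
        maxSeenTimestampMs = Math.max(maxSeenTimestampMs, eventTimestampMs);
        return !late;
    }

    public static void main(String[] args) {
        WatermarkSketch watermark = new WatermarkSketch(5_000);
        // Hypothetical event times in milliseconds, deliberately out of order.
        long[] eventTimes = {10_000, 12_000, 9_000, 20_000, 14_000};
        for (long t : eventTimes) {
            boolean onTime = watermark.accept(t);
            System.out.println(t + " -> " + (onTime ? "on time" : "late")
                    + ", watermark=" + watermark.currentWatermark());
        }
    }
}
```

In this model the event at 9,000 arrives out of order but is still accepted because it is within the tolerance, while the event at 14,000 arrives after the watermark has already advanced to 15,000 and is treated as late. A real application would rely on the timestamp and watermark facilities that Flink itself provides rather than on a hand-rolled class like this one.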

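The monitoring points in the list above (querying by job ID, checkpoints, backpressure and the REST APIs) can be sketched in plain Java as follows. This is only an illustration under stated assumptions: the class name MonitoringSketch, the address http://localhost:8081 and the IDs are placeholders, the paths follow the job, checkpoint and backpressure endpoints as I understand them from the Flink REST API documentation, and the OK/LOW/HIGH thresholds (0.10 and 0.5) are the ones I recall from Flink's backpressure monitoring documentation at the time of writing. Please verify both against the Flink version in use.

```java
import java.net.URI;

/** Sketch of monitoring helpers; host, port and IDs are placeholders. */
final class MonitoringSketch {
    enum BackPressureLevel { OK, LOW, HIGH }

    // Assumed REST address of the cluster, e.g. "http://localhost:8081".
    private final String baseUrl;

    MonitoringSketch(String baseUrl) {
        this.baseUrl = baseUrl;
    }

    /** Details of a single job, looked up by its job ID. */
    URI jobDetails(String jobId) {
        return URI.create(baseUrl + "/jobs/" + jobId);
    }

    /** Checkpoint statistics for a job. */
    URI checkpoints(String jobId) {
        return URI.create(baseUrl + "/jobs/" + jobId + "/checkpoints");
    }

    /** Backpressure information for one vertex (task) of a job. */
    URI backPressure(String jobId, String vertexId) {
        return URI.create(baseUrl + "/jobs/" + jobId
                + "/vertices/" + vertexId + "/backpressure");
    }

    /** Maps a backpressure ratio in [0, 1] to a rating (assumed thresholds). */
    static BackPressureLevel classify(double ratio) {
        if (Double.isNaN(ratio) || ratio < 0.0 || ratio > 1.0) {
            throw new IllegalArgumentException("ratio must be in [0, 1]: " + ratio);
        }
        if (ratio <= 0.10) {
            return BackPressureLevel.OK;
        }
        if (ratio <= 0.50) {
            return BackPressureLevel.LOW;
        }
        return BackPressureLevel.HIGH;
    }

    public static void main(String[] args) {
        MonitoringSketch monitor = new MonitoringSketch("http://localhost:8081");
        // "my-job-id" and "my-vertex-id" stand in for real IDs from the cluster.
        System.out.println(monitor.jobDetails("my-job-id"));
        System.out.println(monitor.checkpoints("my-job-id"));
        System.out.println(monitor.backPressure("my-job-id", "my-vertex-id"));
        System.out.println(classify(0.30));
    }
}
```

The URIs can be handed to any HTTP client. The sketch stops short of making the request, so it runs without a cluster.
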