Wednesday, May 29, 2013

The web interface for the example mentioned in the previous post could be a simple list view built with an MVC framework, with HTML5 and CSS for the views. The stack trace bucket viewer application could be a visual tool for viewing and editing individual stack trace records read from dumps, as well as a way to force the producer to re-read a stack trace from its dump. Each dump entry could carry an additional flag to denote its state, such as new, in progress, and completed, progressing in that order. If the state is reverted, reprocessing is required. If no intermediary states are needed, such as for updates, then inserting or deleting the record suffices to trigger reprocessing. The producer service should watch for dump files and maintain an association between each dump and its entry in the database. If a dump has no entry in the database, the dump is re-read. The lookup between the database and the dump can be quick, since the service can find the entry by path and filename.
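
A minimal sketch of this producer-side check, assuming a hypothetical dump_entries table keyed by path, a local "dumps" folder, and a stub parse_dump standing in for the real dump parser (Python and SQLite are used here purely for illustration):

```python
import os
import sqlite3

# Hypothetical schema: one row per dump, keyed by its full path, with a state
# column that moves new -> in progress -> completed.
SCHEMA = """
CREATE TABLE IF NOT EXISTS dump_entries (
    path      TEXT PRIMARY KEY,
    file_name TEXT NOT NULL,
    state     TEXT NOT NULL DEFAULT 'new'
)
"""

def parse_dump(path):
    """Stub for the real dump parser that extracts stack trace records."""
    return {"file_name": os.path.basename(path)}

def ensure_processed(conn, dump_path):
    """Re-read the dump if it has no database entry or its state was reverted."""
    row = conn.execute(
        "SELECT state FROM dump_entries WHERE path = ?", (dump_path,)
    ).fetchone()
    if row is not None and row[0] == "completed":
        return  # already processed; nothing to do
    parsed = parse_dump(dump_path)
    conn.execute(
        "INSERT OR REPLACE INTO dump_entries (path, file_name, state) "
        "VALUES (?, ?, 'in progress')",
        (dump_path, parsed["file_name"]),
    )
    # ... persist the parsed stack trace records here ...
    conn.execute(
        "UPDATE dump_entries SET state = 'completed' WHERE path = ?", (dump_path,)
    )
    conn.commit()

if __name__ == "__main__":
    conn = sqlite3.connect("dumps.db")
    conn.execute(SCHEMA)
    for name in os.listdir("dumps"):
        if name.endswith(".dmp"):
            ensure_processed(conn, os.path.join("dumps", name))
```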
The file watcher and the service bus are often used together. The service bus helps queue the dumps for processing, and it also helps with error conditions and retries. Queuing goes by other names as well, such as MSMQ and similar message queues. However, depending on the workload, this may or may not be required. The benefit of queuing is that work can be processed asynchronously and retried on failure. When a full queue is not warranted, the service itself can handle this, since it works on one file at a time.
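
Where a full service bus is overkill, an in-process queue can stand in for it. The sketch below, with a hypothetical process_dump and a polled "dumps" folder, pairs a watcher with a single worker that retries a failed file before moving on:

```python
import queue
import threading
import time
from pathlib import Path

work_queue = queue.Queue()   # in-process stand-in for a service bus / MSMQ queue
seen = set()

def watch(folder, interval=5):
    """Poll the dump folder and enqueue any file not seen before."""
    while True:
        for path in Path(folder).glob("*.dmp"):
            if path not in seen:
                seen.add(path)
                work_queue.put(path)
        time.sleep(interval)

def worker(max_retries=3):
    """Process one file at a time, retrying on failure before moving on."""
    while True:
        path = work_queue.get()
        for attempt in range(1, max_retries + 1):
            try:
                process_dump(path)        # hypothetical parser from the producer service
                break
            except OSError:
                time.sleep(2 ** attempt)  # back off, then retry
        work_queue.task_done()

def process_dump(path):
    print(f"processing {path}")

if __name__ == "__main__":
    threading.Thread(target=worker, daemon=True).start()
    watch("dumps")
```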
The table for dumps read and processed can grow arbitrarily large as many different dumps are processed. Depending on the number of dumps processed in a day and the size of the metadata we store for each, the table can grow large enough to require an aging policy and archival of older records. The archival can be batched at the start of every month, during a maintenance window. Archival requires a table similar to the source, possibly in a different database than the live one. The archival stored procedure could read records a few at a time from the source, insert them into the destination, and delete the copied records from the source. If the source is not a single table but a set of related tables, the archival performs this step for every table in the order that inserts are allowed; the deletes run in the reverse order, since the constraints must be handled first. The inserts and deletes are not expected to fail, because we select only the records that are in the source but not yet in the destination. This way we remain in a good state between each incremental move of records, which helps when a large number of records makes the stored procedure run long and prone to interruptions or failures: the archival can resume from where it left off.
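
To illustrate the incremental move, here is a sketch of one archival step under assumed tables: a live dump_entries with a created_at column and an identical dump_entries in an attached archive database. Only rows present in the source but not yet in the destination are copied, so a restarted run picks up where the previous one stopped; with a set of related tables, the same step would run parent-first for inserts and child-first for deletes.

```python
import sqlite3

BATCH = 500  # move a few records at a time so an interrupted run can resume

def archive_batch(conn):
    """One incremental step: copy old rows missing from the archive, then delete them."""
    copied = conn.execute(
        """
        INSERT INTO archive.dump_entries (path, file_name, state, created_at)
        SELECT s.path, s.file_name, s.state, s.created_at
        FROM dump_entries AS s
        WHERE s.created_at < date('now', '-90 days')
          AND NOT EXISTS (SELECT 1 FROM archive.dump_entries d WHERE d.path = s.path)
        LIMIT ?
        """,
        (BATCH,),
    ).rowcount
    # Delete only rows that are already safely in the archive.
    conn.execute(
        """
        DELETE FROM dump_entries
        WHERE created_at < date('now', '-90 days')
          AND path IN (SELECT path FROM archive.dump_entries)
        """
    )
    conn.commit()
    return copied

def archive_all(conn):
    while archive_batch(conn) > 0:
        pass  # repeat until no old, unarchived rows remain in the source

if __name__ == "__main__":
    conn = sqlite3.connect("dumps.db")
    conn.execute("ATTACH DATABASE 'archive.db' AS archive")
    archive_all(conn)
```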
These services work with files and other Windows resources, so they may require tightened security: dumps should be handled only by a service account that has been authorized for reads and writes on the folder. This service account may be different in production and may require full access to all folders and sub-folders. File-handling exceptions often affect the success rate of such file-based services. Internally, the same service account should be granted access to the database where the parsed dump information is stored. Exceptions handled by the services could be logged or stored in the database. On the consumer side of the store, users access it with their own credentials, and their actions can be authenticated and authorized. This way we can tell apart the changes made by either side.
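
For the exception handling mentioned above, a small sketch of writing handled exceptions to a hypothetical error_log table, recording the account the service runs under so its changes can be told apart from end-user activity:

```python
import getpass
import sqlite3
import traceback
from datetime import datetime, timezone

def log_exception(conn, source, exc):
    """Store a handled exception in the database (hypothetical error_log table)."""
    detail = "".join(traceback.format_exception(type(exc), exc, exc.__traceback__))
    conn.execute(
        "INSERT INTO error_log (occurred_at, account, source, detail) "
        "VALUES (?, ?, ?, ?)",
        (datetime.now(timezone.utc).isoformat(), getpass.getuser(), source, detail),
    )
    conn.commit()

# Usage inside the producer, where file-handling exceptions are most common:
# try:
#     ensure_processed(conn, dump_path)
# except OSError as exc:
#     log_exception(conn, dump_path, exc)
```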
Since the services and their dependencies are hosted separately, they may have to tolerate connectivity failures. From an end-to-end perspective, the file IO operations could all be isolated and made local to the machine holding the dumps, while all subsequent processing is done against the database.
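
A sketch of tolerating such connectivity failures: keep the file reads local and wrap only the remote database calls in a retry with backoff (with_retries, parse_dump, and save_records are hypothetical names):

```python
import time

def with_retries(operation, attempts=5, base_delay=1.0):
    """Run an operation, retrying with exponential backoff on connection errors."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except ConnectionError:
            if attempt == attempts:
                raise  # give up after the last attempt
            time.sleep(base_delay * 2 ** (attempt - 1))

# records = parse_dump(local_path)                  # local file IO, no network involved
# with_retries(lambda: save_records(db, records))   # only the database call is retried
```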

 
