Saturday, September 16, 2023

Overwatch deployment issues and resolutions:

· Issue #1) Parameter names have changed

The ETL_STORAGE_PREFIX parameter used to point to the location where the ETL database and the consumer database were stored. Because the underlying storage account is used for a wide variety of tasks, including calculations and report generation, the parameter has been renamed to STORAGE_PREFIX. Earlier, the value was typically a DBFS file location or a /mnt/ folder; the renamed parameter also accepts abfss URIs following the container and storage account convention for locating reports and deployment directories. A /mnt/ folder is still the most reliable route for Overwatch jobs, although the use of mounts is being deprecated in Databricks. Both styles of value are sketched below.
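As a rough illustration, the two styles of prefix value might look like the following; the storage account, container, and folder names are placeholders rather than values from any particular deployment.

```python
# Illustrative values only; the storage account, container, and paths below
# are hypothetical placeholders.

# Older style: a DBFS mount point that the workspace has already mounted.
storage_prefix_mount = "/mnt/overwatch"

# Newer style: a direct abfss URI pointing at a container in the storage account.
storage_prefix_abfss = (
    "abfss://overwatch-container@mystorageaccount.dfs.core.windows.net/overwatch"
)

# Whichever style is chosen, the same value should be supplied consistently
# as STORAGE_PREFIX in the deployment configuration.
```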

· Issue #2) Location migrations with different versions of the Overwatch deployment notebook

Occasionally the version 70 Overwatch deployment notebook is run before the version 71 notebook, and the location specified for the storage prefix may also change between runs as users become aware of the different ways in which each notebook deploys the schema. The two deployments are independent, but the location from the first deployment is the one the hive_metastore will continue to show. Although the table names remain the same across notebook versions, version consistency between the notebook, the databases, and the dashboards is still required. The check sketched below helps confirm which location the metastore actually recorded.
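One way to see which location the metastore recorded is to describe the databases from a notebook cell; the database names below are assumed placeholders for whatever names were chosen during deployment.

```python
# A minimal check; "overwatch_etl" and "overwatch" are placeholder names for
# the ETL and consumer databases chosen during deployment.
for db in ["overwatch_etl", "overwatch"]:
    # The "Location" row shows the storage prefix recorded when the database
    # was first created; a later notebook run with a different prefix does
    # not move it.
    spark.sql(f"DESCRIBE DATABASE EXTENDED {db}").show(truncate=False)
```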

· Issue #3) Missing tables or the generic table or view not found error is encountered when using Overwatch

Even when the notebook execution appears to complete successfully, its output may include messages about the validations that were performed. A false value for any validation check indicates that the database tables will not be as complete as they would be if all the rules had passed. Some executions also fail to create all the tables in the consumer database, so the deployment notebook must be rerun whenever warnings or validation messages appear. If the warnings and errors cannot be cleared, it is better to drop and recreate the databases, as sketched below.
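A minimal sketch of that inspect-and-reset cycle, again with placeholder database names; dropping the databases removes the registered tables, so the deployment notebook has to be rerun afterwards to recreate them.

```python
# Placeholder names for the ETL and consumer databases.
etl_db, consumer_db = "overwatch_etl", "overwatch"

# List what actually got created; tables missing here explain the
# "table or view not found" errors seen downstream.
spark.sql(f"SHOW TABLES IN {consumer_db}").show(truncate=False)

# If warnings and validation failures cannot be cleared by rerunning the
# deployment notebook, drop both databases and let the notebook recreate them.
spark.sql(f"DROP DATABASE IF EXISTS {consumer_db} CASCADE")
spark.sql(f"DROP DATABASE IF EXISTS {etl_db} CASCADE")
```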

· Issue #4) There are stale entries for locations of the etl or consumer databases or there are intermittent errors when reading the data.

A location specified as a mount is accessible only through a service account or a dbx connector; it does not use the credentials of the logged-in user. Access to the remote storage for Overwatch must consistently use both the same account and the same access control, and switching between credentials does not help. The preferred arrangement is for Overwatch to continue running with admin credentials while the data itself is accessed with the storage access token, as sketched below.
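The sketch below shows one common way to pin a single set of storage credentials in the Spark session configuration using a service principal; the storage account name, secret scope, and secret key names are assumptions made for illustration.

```python
# Placeholder storage account; the secret scope and key names are also assumed.
account = "mystorageaccount.dfs.core.windows.net"

spark.conf.set(f"fs.azure.account.auth.type.{account}", "OAuth")
spark.conf.set(
    f"fs.azure.account.oauth.provider.type.{account}",
    "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
)
spark.conf.set(
    f"fs.azure.account.oauth2.client.id.{account}",
    dbutils.secrets.get(scope="overwatch", key="sp-client-id"),
)
spark.conf.set(
    f"fs.azure.account.oauth2.client.secret.{account}",
    dbutils.secrets.get(scope="overwatch", key="sp-client-secret"),
)
spark.conf.set(
    f"fs.azure.account.oauth2.client.endpoint.{account}",
    "https://login.microsoftonline.com/<tenant-id>/oauth2/token",
)
```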

· Issue #5) DB name is not unique or the locations do not match.

The primordial date must be specified in the form yyyy-MM-dd, whereas Excel saves dates in a different format; the value may look consistent to the user, but the mismatch manifests as errors that mostly complain about the database name or location. Specifying the date correctly, making sure the validations pass, and confirming that the databases are created correctly helps smooth out Overwatch operations. A small normalization step is sketched below.
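A minimal sketch of normalizing an Excel-style date into the expected yyyy-MM-dd form; the input value is a placeholder.

```python
from datetime import datetime

raw_value = "9/16/2023"  # a typical Excel-style date string (placeholder)

# Reformat into the yyyy-MM-dd form that the Overwatch deployment expects.
primordial_date = datetime.strptime(raw_value, "%m/%d/%Y").strftime("%Y-%m-%d")
print(primordial_date)  # 2023-09-16
```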
