PQS interface view bug and deployment model change in Canton 3.4
This announcement is for validator operators who use the Participant Query Store (PQS). It describes a PQS bug that is now fixed. PQS users are advised to upgrade to the latest PQS version, 3.4.3. Note that the PQS Docker image deployment model has also changed, as described below.
You can retrieve the new PQS Docker image from europe-docker.pkg.dev/da-images/public/docker/participant-query-store:3.4.3 (the version is now stored as a tag). The available images can be browsed at https://console.cloud.google.com/artifacts/docker/da-images/europe/public/docker%2Fparticipant-query-store.
Bug fix - DAR vetting race condition with interface views
As mentioned in the #cf-global-synchronizer-appdev channel, the Daml model changes in Splice 0.5.11 exposed a bug. The workaround was to make the following configuration changes:
- In k8s, set the Helm value maxVettingDelay on the validator app to 0m.
- In docker-compose, add an environment variable to the validator container named ADDITIONAL_CONFIG_MAX_VETTING_DELAY with the value canton.validator-apps.validator_backend.max-vetting-delay = 0m.
This bug is now fixed in PQS version 3.4.3.
The bug manifested under the following conditions:
- A new package version introduces the implementation of a new interface.
- The new package version (i.e., DAR) is uploaded to a validator but is not vetted.
- A contract from this package is created based on the older version. PQS polls the ledger for the new interface views. However, the ledger cannot compute the view because the new package version has not been vetted yet. It returns an empty interface view to PQS.
The symptom was the following error in the PQS log: java.lang.ClassCastException: class scala.None$ cannot be cast to class java.lang.String. It was logged because the empty interface view could not be converted to the expected view data type, and it caused PQS to restart.
With this fix, the unimplemented view no longer causes PQS to crash; it is simply ignored. When this happens, PQS logs a message containing the text Ignored an interface view value for contractId.
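To check whether a deployment hit this bug, or is now running with the fix, the two messages quoted above can be counted in the PQS log. The following is a minimal shell sketch and not part of PQS: the function name scan_pqs_log is hypothetical, and it assumes the log text is piped in on standard input.

```shell
# Hypothetical helper (not shipped with PQS): reads log text on stdin and
# counts lines containing the two messages quoted in this announcement.
# The body runs in a subshell "( ... )" so its variables stay local.
scan_pqs_log() (
  crash=0
  ignored=0
  # "|| [ -n "$line" ]" also handles a final line without a trailing newline.
  while IFS= read -r line || [ -n "$line" ]; do
    case "$line" in
      # Pre-fix symptom: the empty interface view could not be converted.
      *'class scala.None$ cannot be cast to class java.lang.String'*)
        crash=$((crash + 1)) ;;
      # Post-fix message: the view was ignored instead of crashing PQS.
      *'Ignored an interface view value for contractId'*)
        ignored=$((ignored + 1)) ;;
    esac
  done
  printf 'crash=%d ignored=%d\n' "$crash" "$ignored"
)
```

For instance, assuming the PQS container is named pqs (an assumption, substitute your own container name), the log can be fed in with: docker logs pqs 2>&1 | scan_pqs_log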
Breaking Change to the PQS Docker Image Deployment Model
In the past, a single PQS Docker image could be used with multiple Canton versions. This required the deployment to specify a directory (using the Docker --workdir parameter) that matched the Canton version the Docker image was connecting to. This was confusing to users and also constrained how quickly bug fixes (such as the one described above) could be delivered.
To resolve this, the PQS Docker image packaging has been simplified. There is now a separate PQS Docker image per release (e.g., PQS version 3.4.3). Please remove the --workdir parameter, as it is no longer needed. For example, the following command now works for PQS version 3.4.3: docker run -it europe-docker.pkg.dev/da-images/public/docker/participant-query-store:3.4.3 --version
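Because the version is now carried in the image tag, the image reference can be derived from the PQS version alone. The sketch below illustrates this under stated assumptions: the helper names pqs_image_ref and pqs_print_version are hypothetical, and the second function needs Docker and network access to the registry.

```shell
# Hypothetical helper: build the image reference for a given PQS version.
# The registry path is the one given in this announcement; the tag is the
# PQS version.
pqs_image_ref() {
  printf '%s:%s\n' \
    'europe-docker.pkg.dev/da-images/public/docker/participant-query-store' \
    "$1"
}

# Hypothetical helper: run the image's --version check for that version.
# Note that no --workdir argument is passed.
pqs_print_version() {
  docker run -it "$(pqs_image_ref "$1")" --version
}
```

Calling pqs_print_version 3.4.3 runs the same command as the example above.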