Summary
Issue: After upgrading to data plane v1.1.32, GCP self-hosted deployments log checksum mismatch errors in braintrust-api. Traces appear incomplete (spans stuck “in progress”), the /logs3 endpoint returns 500 Internal Server Error responses, and tracing becomes effectively unusable.
Cause: Data plane v1.1.32 bundles AWS SDK v3.723+, which changed its defaults to calculate request checksums and validate response checksums whenever an operation supports them. GCP’s S3 compatibility layer does not return the checksum values the SDK now expects, so every object storage operation fails with a checksum mismatch.
Resolution: Set two environment variables on braintrust-api to disable strict checksum validation, or upgrade to Helm chart v5.0.1+, which includes the fix automatically.
Symptoms
You may see one or more of the following after upgrading to data plane v1.1.32:
- Checksum mismatch errors in braintrust-api logs:
Error: Checksum mismatch: expected "H4DoSA==" but received "JuEkhQ=="
in response header "x-amz-checksum-crc32c"
- Traces with child spans stuck “in progress” that never complete
- 500 Internal Server Error responses with "Service":"api" on the /logs3 endpoint
- Spans within traces appearing inconsistently or missing
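To check for the first symptom, you can count matching lines in the API logs. This is a minimal sketch: the helper name count_checksum_errors is made up for illustration, and the usage line assumes the deployment name and namespace used in the restart command later in this article. Note that kubectl logs on a deployment reads from a single pod, so repeat the check per pod if you run several replicas.

```shell
# Hypothetical helper: counts log lines on stdin that contain "Checksum mismatch".
# grep -c exits non-zero when nothing matches, so "|| true" keeps the
# function's exit status clean while still printing 0.
count_checksum_errors() {
  grep -c 'Checksum mismatch' || true
}

# Usage (assumes deployment "braintrust-api" in namespace "braintrust";
# adjust --since to cover the period after your upgrade):
#   kubectl logs deployment/braintrust-api -n braintrust --since=1h | count_checksum_errors
```

A non-zero count after the upgrade to v1.1.32 suggests you are hitting this issue.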
Who is affected
This issue only affects deployments that meet all of these criteria:
- Running on GCP
- Using S3 compatibility mode for object storage
- Not using native GCS auth (ENABLE_GCS_AUTH is not set to true)
- Upgraded to data plane v1.1.32
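If you are unsure whether native GCS auth is enabled, one way to check is to list the environment variables on the deployment. The sketch below assumes ENABLE_GCS_AUTH is set as a literal value in the pod spec (not pulled from a ConfigMap or Secret) and uses a hypothetical helper name, gcs_auth_enabled.

```shell
# Hypothetical helper: reads NAME=value lines on stdin and succeeds only
# if one of them is exactly ENABLE_GCS_AUTH=true.
gcs_auth_enabled() {
  grep -qx 'ENABLE_GCS_AUTH=true'
}

# Usage (assumes deployment "braintrust-api" in namespace "braintrust"):
#   if kubectl set env deployment/braintrust-api -n braintrust --list | gcs_auth_enabled; then
#     echo "native GCS auth is enabled; this issue should not apply"
#   else
#     echo "native GCS auth is not enabled; check the other criteria"
#   fi
```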
Resolution Steps
Option 1: Upgrade Helm chart to v5.0.1+ (recommended)
Step 1: Update your Helm chart version
Upgrade to Helm chart version 5.0.1 or later, which includes the fix automatically.
# In your Helm values or Terraform Helm release configuration
version = "5.0.1"
Step 2: Apply the upgrade
helm upgrade braintrust braintrust/braintrust -f values.yaml
Step 3: Verify the fix
Check braintrust-api logs to confirm the checksum mismatch errors have stopped. Send a test trace and verify that all spans complete successfully.
Option 2: Manually set environment variables
If you cannot upgrade the Helm chart immediately, set these two environment variables on the braintrust-api deployment.
Step 1: Add environment variables
Add the following to your braintrust-api configuration (via Helm extraEnvVars, Terraform, or your deployment manifest):
AWS_REQUEST_CHECKSUM_CALCULATION: "WHEN_REQUIRED"
AWS_RESPONSE_CHECKSUM_VALIDATION: "WHEN_REQUIRED"
Both variables are required. Setting only one will not fully resolve the issue.
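Because both variables must be present, it can help to confirm that the deployment spec actually carries them after you apply your change. The following is a sketch only: checksum_vars_ok is a hypothetical helper, and it assumes the variables are set as literal values on the braintrust-api deployment in the braintrust namespace.

```shell
# Hypothetical helper: reads NAME=value lines on stdin and succeeds only if
# both checksum variables are present and set to WHEN_REQUIRED.
checksum_vars_ok() {
  env_list=$(cat)
  status=0
  for name in AWS_REQUEST_CHECKSUM_CALCULATION AWS_RESPONSE_CHECKSUM_VALIDATION; do
    if ! printf '%s\n' "$env_list" | grep -qx "${name}=WHEN_REQUIRED"; then
      echo "missing or not WHEN_REQUIRED: ${name}" >&2
      status=1
    fi
  done
  return "$status"
}

# Usage (assumes deployment "braintrust-api" in namespace "braintrust"):
#   kubectl set env deployment/braintrust-api -n braintrust --list | checksum_vars_ok \
#     && echo "both checksum variables are set"
```

The helper reports each missing variable separately, which makes it easier to spot the case where only one of the two was applied.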
Step 2: Restart the API pods
kubectl rollout restart deployment braintrust-api -n braintrust
Step 3: Verify the fix
Check braintrust-api logs to confirm the checksum mismatch errors have stopped. Send a test trace and verify that all spans complete successfully.
Do not roll back from v1.1.32 to an earlier data plane version. Database schema changes in v1.1.32 are not backward-compatible, and rolling back may cause additional data integrity issues.