Postgres transactions aborted #8330
Unanswered
simonphughes asked this question in General
Replies: 2 comments
-
Bumping this question as we have been noticing the same issue recently; it was not a problem previously. We are also using the same ECS setup with an Aurora Serverless cluster.
-
Bumping this as well; I'm having a similar issue.
-
I have Hasura running in production against a PostgreSQL database hosted on an AWS Aurora Serverless cluster. The solution works really well in general, with the platform scaling resources up on demand and back down in quiet times. What I am now seeing almost daily, as usage ramps up, is HTTP 500 server errors coming from Hasura.
I used the CloudFormation script provided on GitHub to create our environment, which includes a load balancer at the front end, Hasura running in 2 containers on a Fargate ECS cluster, and the Aurora Serverless cluster. The platform always recovers, and so far we only see a moment of downtime which I believe isn't affecting customers, but I would like to get a better understanding of what is going wrong.
In the CloudWatch logs, the first out-of-the-ordinary event contains the following:
"internal": {
"statement": "BEGIN ISOLATION LEVEL REPEATABLE READ ",
"prepared": true,
"error": {
"exec_status": "FatalError",
"hint": null,
"message": "SET TRANSACTION ISOLATION LEVEL must be called before any query",
"status_code": "25001",
"description": null
},
"arguments": []
},
This is then followed by:
WARNING: there is already a transaction in progress
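For reference, both of these server messages can be reproduced by hand on a connection that is already inside a transaction. The snippet below is a minimal sketch only, assuming a throwaway Postgres database and psycopg2; the connection details are placeholders and have nothing to do with the real deployment:

# Illustrative only: trigger SQLSTATE 25001 and the "already a transaction
# in progress" warning on a connection that is already mid-transaction.
import psycopg2

conn = psycopg2.connect("dbname=scratch user=postgres")  # placeholder DSN
cur = conn.cursor()

cur.execute("SELECT 1")  # psycopg2 implicitly opens a transaction here

try:
    # Postgres only allows changing the isolation level before the first
    # query of a transaction, hence the 25001 error above.
    cur.execute("SET TRANSACTION ISOLATION LEVEL REPEATABLE READ")
except psycopg2.DatabaseError as e:
    print(e.pgcode, e)  # 25001: SET TRANSACTION ISOLATION LEVEL must be called ...

conn.rollback()          # clear the aborted transaction
cur.execute("SELECT 1")  # a new transaction is implicitly opened again

# Issuing BEGIN while already in a transaction makes the server emit
# "WARNING: there is already a transaction in progress".
cur.execute("BEGIN ISOLATION LEVEL REPEATABLE READ")
print(conn.notices)      # the warning shows up here
conn.rollback()
conn.close()

Seeing the error and the warning together suggests the connection was already mid-transaction when Hasura tried to start a new one, i.e. the client and the server had different ideas about the connection's state.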
Then I get the following:
"internal": {
"statement": "ABORT",
"prepared": true,
"error": "cmd expected; tuples found",
"arguments": []
},
and finally this:
{
  "type": "scheduled-trigger",
  "timestamp": "2022-03-16T13:08:02.856+0000",
  "level": "error",
  "detail": {
    "path": "$",
    "error": "QErr {qePath = [], qeStatus = Status {statusCode = 500, statusMessage = "Internal Server Error"}, qeError = "postgres tx error", qeCode = postgres-error, qeInternal = Just (Object (fromList [("statement",String "BEGIN ISOLATION LEVEL REPEATABLE READ "),("prepared",Bool True),("error",Object (fromList [("exec_status",String "FatalError"),("hint",Null),("message",String "prepared statement \"5\" already exists"),("status_code",String "42P05"),("description",Null)])),("arguments",Array [])]))}",
    "code": "unexpected"
  }
}
After that, the AWS load balancer kills off the container hosting Hasura, spins up a new one, and everything carries on working.
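One detail that stands out in that last log entry is the 42P05. A server-side prepared statement name lives for the lifetime of the session, so preparing the same name twice on the same session fails; the sketch below is only an illustration of that behaviour (again assuming a throwaway Postgres and psycopg2, with a placeholder DSN and the name "5" borrowed from the log above):

# Illustrative only: SQLSTATE 42P05 is what Postgres returns when a prepared
# statement name is reused within the same server session.
import psycopg2

conn = psycopg2.connect("dbname=scratch user=postgres")  # placeholder DSN
conn.autocommit = True
cur = conn.cursor()

cur.execute('PREPARE "5" AS SELECT 1')      # first PREPARE succeeds
try:
    cur.execute('PREPARE "5" AS SELECT 1')  # same name on the same session
except psycopg2.DatabaseError as e:
    print(e.pgcode, e)  # 42P05: prepared statement "5" already exists
finally:
    conn.close()

A statement surviving on the server while the client believes it is starting fresh might indicate that the client's view of the session's prepared statements no longer matched the server's.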
Could this be related to scaling events on the Aurora cluster blocking things? Any other suggestions?
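One way to check would be to leave a small probe running against the same cluster endpoint and compare its failure timestamps with the Aurora scaling events in CloudWatch. A rough sketch only, again assuming psycopg2 and with a placeholder connection string rather than the real endpoint:

# Rough sketch of a probe: run the same kind of transaction the Hasura logs
# show and print a timestamp plus SQLSTATE whenever it fails, so failures can
# be lined up against Aurora scaling events in CloudWatch.
import time
import psycopg2

DSN = "host=example.cluster-abc.us-east-1.rds.amazonaws.com dbname=postgres user=probe password=change-me"  # placeholder

while True:
    try:
        conn = psycopg2.connect(DSN, connect_timeout=5)
        conn.autocommit = True  # so the explicit BEGIN below is the real one
        cur = conn.cursor()
        cur.execute("BEGIN ISOLATION LEVEL REPEATABLE READ")
        cur.execute("SELECT 1")
        cur.execute("COMMIT")
        conn.close()
    except psycopg2.Error as e:
        # pgcode is None for connection-level failures such as dropped connections
        print(time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()), e.pgcode, repr(e))
    time.sleep(10)

If the failures cluster around capacity changes, that would point at the scaling events; if not, something else is resetting the connections.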
Many thanks