I am a newbie to PostgreSQL. My project deals with financial transactions and has a few tables with huge amounts of transaction data that see frequent inserts, updates, and deletes.
When I started, I came across an error saying autovacuum was not working properly, and I noticed that the database size had grown huge and was causing memory issues, so the team increased the RAM from 16 GB to 24 GB; the hard disk is 324 GB.
Going through the system and the pg_log files, the errors were all related to vacuum not working properly, like the one below:
TopMemoryContext: 61752 total in 8 blocks; 8400 free (10 chunks); 53352 used
TopTransactionContext: 8192 total in 1 blocks; 7856 free (24 chunks); 336 used
TOAST to main relid map: 57344 total in 3 blocks; 34480 free (11 chunks); 22864 used
AV worker: 24576 total in 2 blocks; 17608 free (9 chunks); 6968 used
Autovacuum Portal: 8192 total in 1 blocks; 8168 free (0 chunks); 24 used
Vacuum: 8192 total in 1 blocks; 8128 free (0 chunks); 64 used
Operator class cache: 8192 total in 1 blocks; 4864 free (0 chunks); 3328 used
smgr relation table: 8192 total in 1 blocks; 2808 free (0 chunks); 5384 used
TransactionAbortContext: 32768 total in 1 blocks; 32744 free (0 chunks); 24 used
Portal hash: 8192 total in 1 blocks; 3904 free (0 chunks); 4288 used
PortalMemory: 0 total in 0 blocks; 0 free (0 chunks); 0 used
Relcache by OID: 8192 total in 1 blocks; 2848 free (0 chunks); 5344 used
CacheMemoryContext: 516096 total in 6 blocks; 231840 free (1 chunks); 284256 used
log_trace_data_date_time_index: 1024 total in 1 blocks; 584 free (0 chunks); 440 used
log_trace_data_destination_conn_id_index: 1024 total in 1 blocks; 584 free (0 chunks); 440 used
log_trace_data_source_conn_id_index: 1024 total in 1 blocks; 584 free (0 chunks); 440 used
log_trace_data_trace_number_index: 1024 total in 1 blocks; 584 free (0 chunks); 440 used
log_trace_data_reference_retrieval_number_index: 1024 total in 1 blocks; 584 free (0 chunks); 440 used....
pg_database_datname_index: 1024 total in 1 blocks; 544 free (0 chunks); 480 used
pg_replication_origin_roiident_index: 1024 total in 1 blocks; 584 free (0 chunks); 440 used
pg_auth_members_role_member_index: 1024 total in 1 blocks; 512 free (0 chunks); 512 used
pg_database_oid_index: 1024 total in 1 blocks; 544 free (0 chunks); 480 used
pg_authid_rolname_index: 1024 total in 1 blocks; 584 free (0 chunks); 440 used
WAL record construction: 49528 total in 2 blocks; 6872 free (0 chunks); 42656 used
PrivateRefCount: 8192 total in 1 blocks; 5960 free (0 chunks); 2232 used
MdSmgr: 8192 total in 1 blocks; 7448 free (0 chunks); 744 used
LOCALLOCK hash: 8192 total in 1 blocks; 4928 free (0 chunks); 3264 used
Timezones: 104064 total in 2 blocks; 5960 free (0 chunks); 98104 used
ErrorContext: 8192 total in 1 blocks; 8168 free (0 chunks); 24 used
xxxx 00:00:03 EST ERROR: out of memory
xxxx 00:00:03 EST DETAIL: Failed on request of size 503316480.
xxxx 00:00:03 EST CONTEXT: automatic vacuum of table "novus.log.trace_data"
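The failed request of 503316480 bytes (480 MB) makes me suspect the vacuum memory settings might be set higher than what the server can actually allocate, though I am not sure. These are the settings I am planning to check first (just SHOW commands, nothing changed yet):

    SHOW shared_buffers;        -- memory reserved for the buffer cache
    SHOW maintenance_work_mem;  -- memory a manual VACUUM may request
    SHOW autovacuum_work_mem;   -- memory an autovacuum worker may request (-1 = use maintenance_work_mem)
    SHOW work_mem;              -- per-sort/hash memory for normal queries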
I got to know that VACUUM FULL takes an exclusive lock on the table, but I was not sure of any other way to clean things up, so I tried running "VACUUM FULL tablename" once on each of the 3 huge tables that get frequent DML, to check whether there was any change.
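Roughly what I ran, shown here against the log.trace_data table from the error above (a sketch of my steps rather than the exact commands; the size checks are just how I compared before and after):

    -- total size (table + indexes + TOAST) before the rewrite
    SELECT pg_size_pretty(pg_total_relation_size('log.trace_data'));

    -- rewrites the whole table into new files and returns the freed space to the OS;
    -- takes an ACCESS EXCLUSIVE lock, so reads and writes on the table are blocked while it runs
    VACUUM FULL log.trace_data;

    -- total size after the rewrite
    SELECT pg_size_pretty(pg_total_relation_size('log.trace_data'));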
Output: the database size, which had grown to 125 GB, came down to 70 GB after the VACUUM FULL, and autovacuum worked fine until a few days ago. I am not sure whether I did this right, and now I see messages about the oldest xmin in the log (see the query I am planning to run, after the error list below). As I am still going through various articles and learning PostgreSQL, could you please help me understand a few things better: did I do that right, and is there another way to resolve this autovacuum error? The logs are now showing various kinds of errors and warnings; also, some servers run a web-based PostgreSQL frontend and I am not able to read the log file properly there. Some of the errors shown in the pg_log files:
- checkpoints are occurring too frequently => this appeared and later vanished on its own (I had also tried toying with max_wal_size because of this)
- could not rename temporary statistics file "pg_stat_tmp/global.tmp" to "pg_stat_tmp/global.stat": Permission denied
- canceling autovacuum task
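From what I have read, dead rows cannot be removed while something still holds an old snapshot, which is what the "oldest xmin" messages seem to point at. This is the check I am planning to run to see what might be holding the oldest xmin back (just my reading of the docs, please correct me if this is the wrong approach):

    -- sessions whose snapshots may be holding back the oldest xmin
    SELECT pid, state, xact_start, backend_xmin, left(query, 60) AS query
    FROM pg_stat_activity
    WHERE backend_xmin IS NOT NULL
    ORDER BY age(backend_xmin) DESC;

    -- forgotten prepared transactions and stale replication slots can do the same
    SELECT * FROM pg_prepared_xacts;
    SELECT slot_name, active, xmin FROM pg_replication_slots;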
I would like to know more about:
- How do I resolve the bloat?
- Even though autovacuum appears to be working again, I am not sure whether dead rows are actually being removed and the space released. Are the deleted rows not becoming invisible/frozen? (I have put the check I intend to use after this list.)
- Are these all side effects of the initial changes I tried?
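To check whether dead rows are actually being cleaned up, this is what I am planning to look at (my understanding from the docs; the ordering/LIMIT is just to surface the worst tables, and pgstattuple is an optional extension I have not installed yet):

    -- dead-tuple counts and last (auto)vacuum times per table
    SELECT schemaname, relname, n_live_tup, n_dead_tup,
           last_vacuum, last_autovacuum, autovacuum_count
    FROM pg_stat_user_tables
    ORDER BY n_dead_tup DESC
    LIMIT 10;

    -- more precise bloat numbers, if the pgstattuple extension is available
    -- CREATE EXTENSION pgstattuple;
    -- SELECT * FROM pgstattuple('log.trace_data');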
Project setup: the database runs on 2 servers. Transactions are first saved in the production database's "spool" table and later moved to the support server's database's "transaction" table. The issue now is that the spool table is getting huge and the transaction table cannot keep up, because autovacuum was not working properly.