I am a newbie to PostgreSQL. My project deals with financial transactions and has a few tables with huge amounts of transaction data that see frequent inserts, updates, and deletes.
When I started, I came across an error saying autovacuum was not working properly, and I noticed that the database size had grown huge and was causing memory issues, so the team increased the RAM from 16 GB to 24 GB; the hard disk is 324 GB.
Going through the system and the pg_log files, the errors were all related to vacuum not working properly, like the one below:
TopMemoryContext: 61752 total in 8 blocks; 8400 free (10 chunks); 53352 used
TopTransactionContext: 8192 total in 1 blocks; 7856 free (24 chunks); 336 used
TOAST to main relid map: 57344 total in 3 blocks; 34480 free (11 chunks); 22864 used
AV worker: 24576 total in 2 blocks; 17608 free (9 chunks); 6968 used
Autovacuum Portal: 8192 total in 1 blocks; 8168 free (0 chunks); 24 used
Vacuum: 8192 total in 1 blocks; 8128 free (0 chunks); 64 used
Operator class cache: 8192 total in 1 blocks; 4864 free (0 chunks); 3328 used
smgr relation table: 8192 total in 1 blocks; 2808 free (0 chunks); 5384 used
TransactionAbortContext: 32768 total in 1 blocks; 32744 free (0 chunks); 24 used
Portal hash: 8192 total in 1 blocks; 3904 free (0 chunks); 4288 used
PortalMemory: 0 total in 0 blocks; 0 free (0 chunks); 0 used
Relcache by OID: 8192 total in 1 blocks; 2848 free (0 chunks); 5344 used
CacheMemoryContext: 516096 total in 6 blocks; 231840 free (1 chunks); 284256 used
log_trace_data_date_time_index: 1024 total in 1 blocks; 584 free (0 chunks); 440 used
log_trace_data_destination_conn_id_index: 1024 total in 1 blocks; 584 free (0 chunks); 440 used
log_trace_data_source_conn_id_index: 1024 total in 1 blocks; 584 free (0 chunks); 440 used
log_trace_data_trace_number_index: 1024 total in 1 blocks; 584 free (0 chunks); 440 used
log_trace_data_reference_retrieval_number_index: 1024 total in 1 blocks; 584 free (0 chunks); 440 used....
pg_database_datname_index: 1024 total in 1 blocks; 544 free (0 chunks); 480 used
pg_replication_origin_roiident_index: 1024 total in 1 blocks; 584 free (0 chunks); 440 used
pg_auth_members_role_member_index: 1024 total in 1 blocks; 512 free (0 chunks); 512 used
pg_database_oid_index: 1024 total in 1 blocks; 544 free (0 chunks); 480 used
pg_authid_rolname_index: 1024 total in 1 blocks; 584 free (0 chunks); 440 used
WAL record construction: 49528 total in 2 blocks; 6872 free (0 chunks); 42656 used
PrivateRefCount: 8192 total in 1 blocks; 5960 free (0 chunks); 2232 used
MdSmgr: 8192 total in 1 blocks; 7448 free (0 chunks); 744 used
LOCALLOCK hash: 8192 total in 1 blocks; 4928 free (0 chunks); 3264 used
Timezones: 104064 total in 2 blocks; 5960 free (0 chunks); 98104 used
ErrorContext: 8192 total in 1 blocks; 8168 free (0 chunks); 24 used
xxxx 00:00:03 EST ERROR: out of memory
xxxx 00:00:03 EST DETAIL: Failed on request of size 503316480.
xxxx 00:00:03 EST CONTEXT: automatic vacuum of table "novus.log.trace_data"
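The failed request of 503316480 bytes (480 MB) makes me suspect the vacuum memory settings might be set higher than what the server can actually allocate, though I am not sure. These are the settings I am planning to check first (just SHOW commands, nothing changed yet):

    SHOW shared_buffers;        -- memory reserved for the buffer cache
    SHOW maintenance_work_mem;  -- memory a manual VACUUM may request
    SHOW autovacuum_work_mem;   -- memory an autovacuum worker may request (-1 = use maintenance_work_mem)
    SHOW work_mem;              -- per-sort/hash memory for normal queries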
I got to know that VACUUM FULL takes an exclusive lock on the table, but I was not sure of any other way to clean things up, so I tried running "VACUUM FULL tablename" once on each of the 3 huge tables that get frequent DML, to check whether there was any change.
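Roughly what I ran, shown here against the log.trace_data table from the error above (a sketch of my steps rather than the exact commands; the size checks are just how I compared before and after):

    -- total size (table + indexes + TOAST) before the rewrite
    SELECT pg_size_pretty(pg_total_relation_size('log.trace_data'));

    -- rewrites the whole table into new files and returns the freed space to the OS;
    -- takes an ACCESS EXCLUSIVE lock, so reads and writes on the table are blocked while it runs
    VACUUM FULL log.trace_data;

    -- total size after the rewrite
    SELECT pg_size_pretty(pg_total_relation_size('log.trace_data'));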
Output: the database size, which had grown to 125 GB, came down to 70 GB after the VACUUM FULL, and autovacuum worked fine until a few days ago. I am not sure whether I did this right, and now I see messages about the oldest xmin in the log (see the query I am planning to run, after the error list below). As I am still going through various articles and learning PostgreSQL, could you please help me understand a few things better: did I do that right, and is there another way to resolve this autovacuum error? The logs are now showing various kinds of errors and warnings; also, some servers run a web-based PostgreSQL frontend and I am not able to read the log file properly there. Some of the errors shown in the pg_log files:
- checkpoints are occurring too frequently => this appeared and later vanished on its own (I had also tried toying with max_wal_size because of this)
- could not rename temporary statistics file "pg_stat_tmp/global.tmp" to "pg_stat_tmp/global.stat": Permission denied
- canceling autovacuum task
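From what I have read, dead rows cannot be removed while something still holds an old snapshot, which is what the "oldest xmin" messages seem to point at. This is the check I am planning to run to see what might be holding the oldest xmin back (just my reading of the docs, please correct me if this is the wrong approach):

    -- sessions whose snapshots may be holding back the oldest xmin
    SELECT pid, state, xact_start, backend_xmin, left(query, 60) AS query
    FROM pg_stat_activity
    WHERE backend_xmin IS NOT NULL
    ORDER BY age(backend_xmin) DESC;

    -- forgotten prepared transactions and stale replication slots can do the same
    SELECT * FROM pg_prepared_xacts;
    SELECT slot_name, active, xmin FROM pg_replication_slots;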
I would like to know more about:
- How do I resolve the bloat?
- Even though autovacuum appears to be working again, I am not sure whether dead rows are actually being removed and the space released. Are the deleted rows not becoming invisible/frozen? (I have put the check I intend to use after this list.)
- Are these all side effects of the initial changes I tried?
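To check whether dead rows are actually being cleaned up, this is what I am planning to look at (my understanding from the docs; the ordering/LIMIT is just to surface the worst tables, and pgstattuple is an optional extension I have not installed yet):

    -- dead-tuple counts and last (auto)vacuum times per table
    SELECT schemaname, relname, n_live_tup, n_dead_tup,
           last_vacuum, last_autovacuum, autovacuum_count
    FROM pg_stat_user_tables
    ORDER BY n_dead_tup DESC
    LIMIT 10;

    -- more precise bloat numbers, if the pgstattuple extension is available
    -- CREATE EXTENSION pgstattuple;
    -- SELECT * FROM pgstattuple('log.trace_data');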
Project setup: the database runs on 2 servers. Transactions are first saved in the production database's "spool" table and later moved to the support server's database's "transaction" table. The issue now is that the spool table is getting huge and the transaction table cannot keep up, because autovacuum was not working properly.