We had a VM behaving abnormally recently, and on investigation this was being caused by the disk being almost full.
Compared with another system on the same type of VM, this one was using 96% of its disk space and the other just 13%. So what’s going on?
I called out for help on Facebook, and Clinton Jackson, Benjamin Chennels-Webb and Aaron Falzon all came up with some good ideas, so let’s go through those first:
- Are Call recordings turned on?
  Go to Users / Select a User / Select Options / Check Call Recording Settings
- Check if you are caching too many firmware files
  SSH into the Server and check /var/lib/3cxpbx/Instance1/Data/Http/Interface/provisioning/<instance>/firmware
- Is Verbose logging enabled?
  Go to Dashboard / Activity log and click ‘Settings’ to check (requires a restart after changes)
Sadly none of these things helped, so it’s time to dive deeper…
First, SSH into the Server
ssh root@yourIP
Change to the root directory
cd /
Show me the biggest directories
du -hs * | sort -rh | head -5
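Breaking down this command, it says ‘Estimate disk usage (du), in human-readable format (-h), showing only a total for each argument (-s), for everything in the current directory (*), then pipe to sort with the largest first (-rh) and show me the top 5 (head -5)’. Note that * does not match hidden files or directories. If you want to get more exact, read the tutorial linked at the bottom.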
16G var
921M usr
504M lib
70M boot
21M run
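One gap in the command above is that `*` skips hidden entries, and `du` will happily wander into other mounted filesystems. Here is a minimal sketch of a variant that covers both, assuming GNU `find`, `du` and `sort` as shipped on Debian. The function name `biggest` and its two arguments are made up for this example.

```shell
# biggest: list the N largest entries directly under DIR, hidden ones included.
# Hypothetical helper: DIR defaults to the current directory, N defaults to 5.
biggest() {
    bg_dir=${1:-.}
    bg_count=${2:-5}
    # find hands every immediate child of DIR (dotfiles too) to du.
    # du -s prints one total per entry, -h is human readable,
    # -x stays on the same filesystem as each entry.
    find "$bg_dir" -mindepth 1 -maxdepth 1 -exec du -shx -- {} + 2>/dev/null |
        sort -rh | head -n "$bg_count"
}
```

Calling `biggest / 5` should give much the same picture as the original command, with full paths instead of bare names.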
Well, we are looking for the big wins, so let’s go for /var
cd var
Execute the du command again
du -hs * | sort -rh | head -5
15G lib
651M cache
334M log
1.2M backups
16K spool
Well, /var/lib looks interesting
…So basically we are navigating around the operating system and issuing the above command to show the size of directories. What did we find?
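The navigate-and-repeat routine described above can be automated. This is only a sketch, assuming GNU `find`, `du` and `sort`; `drill_down` is a hypothetical helper name, not something that ships with 3CX or Debian. It steps into the largest subdirectory at each level and prints what it found.

```shell
# drill_down: starting at DIR, repeatedly step into the largest subdirectory
# and print "<size in KiB><TAB><path>" for each step, up to DEPTH levels.
# Hypothetical helper: DIR defaults to /, DEPTH defaults to 5.
drill_down() {
    dd_dir=${1:-/}
    dd_depth=${2:-5}
    while [ "$dd_depth" -gt 0 ]; do
        # Largest immediate subdirectory of the current directory.
        dd_top=$(find "$dd_dir" -mindepth 1 -maxdepth 1 -type d \
                     -exec du -skx -- {} + 2>/dev/null | sort -rn | head -n 1)
        # Stop when there are no subdirectories left to descend into.
        [ -n "$dd_top" ] || break
        printf '%s\n' "$dd_top"
        # du separates size and path with a tab; keep only the path.
        dd_dir=$(printf '%s\n' "$dd_top" | cut -f2-)
        dd_depth=$((dd_depth - 1))
    done
}
```

Something like `drill_down /var 6` would follow the same trail as the manual steps. It always chases the single biggest directory, so it can miss a second large offender sitting next to it.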
4.2GB of logs in /var/lib/3cxpbx/Instance1/Data/Logs
And 3.7GB of that is in a folder called Logs/Backup/{date}, with a date of yesterday.
What happened? Well, according to the 3CX website:
The 3CX Server Logs are made up of one bldef file and one or more blrec files. The bldef file is the file that contains information about the tags, and other index data. The blrec files are the files that hold the logs. Both files are required in order to read 3CX Logs. Note that one bldef file can be used to read multiple blrec files.
The 3CX logs rotate when they reach 50MB. If the Keep Backup option (in “3CX Management Console (Dashboard) → Activity Log → Settings”) is disabled, two blrec log files are kept – the current one and the previous one. If the Keep Backup setting is enabled, the older files will be moved to the backup folder. There is an option to keep backup of log files for X number of days. This affects how many 3CX Logs are kept in the backup.
And sure enough, the folder is full of 50MB log files. So something has caused the Server to go nuts and produce a ton of logs. When each log reaches 50MB it is shunted to a subfolder, BUT if the Server produces many of these in a day, there’s theoretically nothing to stop them from filling up the disk, which is what happened here. This is good from the point of view of threat detection or investigation, not so good for the Server.
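To see which day the burst happened on, it helps to count the rotated files in each dated backup folder. A rough sketch, assuming the backup directory holds one subfolder per date as described above; `log_burst` is a hypothetical helper name.

```shell
# log_burst: for each subfolder of a log backup directory, print
# "<file count> <size in KiB> <folder>". Hypothetical helper.
log_burst() {
    for lb_day in "$1"/*/; do
        # Skip the unexpanded pattern when there are no subfolders.
        [ -d "$lb_day" ] || continue
        lb_count=$(find "$lb_day" -type f | wc -l | tr -d ' ')
        lb_size=$(du -sk -- "$lb_day" | cut -f1)
        printf '%s %s %s\n' "$lb_count" "$lb_size" "${lb_day%/}"
    done
}
```

On the server in this post that would be `log_burst /var/lib/3cxpbx/Instance1/Data/Logs/Backup`. A day with dozens of files where the others have a handful is the one to investigate.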
Result?
The VM now has over 50% free space. We’ll need to watch it over the next few days, but this seems to be the answer. But what about the CAUSE of the logtastrophe, I hear you scream? Well, we also checked for indicators of compromise and found none, but we will definitely set up some more monitoring for this (and other) 3CX VMs.
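As a starting point for that monitoring, here is a minimal sketch of a disk-usage check that could be run from cron. It is not what we deployed, just an illustration. The helper names `usage_pct` and `disk_alert` and the 80% default limit are assumptions made for this example.

```shell
# usage_pct: print the used-space percentage (digits only) for a mount point.
# df -P gives the portable one-line-per-filesystem layout; column 5 is capacity.
usage_pct() {
    df -P "${1:-/}" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# disk_alert: print a warning and return 1 when usage is at or above LIMIT.
# Hypothetical helper: MOUNT defaults to /, LIMIT defaults to 80 (percent).
disk_alert() {
    da_mount=${1:-/}
    da_limit=${2:-80}
    da_used=$(usage_pct "$da_mount")
    if [ "$da_used" -ge "$da_limit" ]; then
        echo "WARNING: $da_mount is ${da_used}% full (limit ${da_limit}%)"
        return 1
    fi
    return 0
}
```

Run as `disk_alert / 80` on a schedule, with the output sent wherever your alerts normally go. A check like this would have flagged the VM long before it reached 96%.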
References
Thanks to Tecmint for this article about finding file and directory sizes
And thanks to the people of Facebook for the help!