soup reports 2 Fails of 777 2.3.100 -> 2.3.110-20220407 hotfix #7752
-
Hi all, I had been running SO 2.3.100 on a CentOS 7 based grid (5 sensors, 1 search node, 1 manager). I rebooted first (by request at the CLI) and then ran soup tonight, so I believe I got 2.3.110-20220407. soup's update of itself went fine, but the "run again" second pass ended with a summary reporting 2 failed states out of 777.
I reviewed soup.log, but I'm naive and don't know what I'm looking for (soup has always just worked...). I did run soup again (I trust Doug and the team, a lot), but it just said the same thing. There is a chunk about 1100 lines upstream that seems relevant, but it's a big blockquote and, as I said, I don't know what I'm looking for.
Running sudo salt-run jobs.lookup_jid 20220409001105008902 returns results that seem reassuring. There are about 10K lines above that before the previous "Summary for local" (Succeeded: 1 (changed=1), Failed: 0, FWIW).
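(For anyone digging into a similar failure: the jid above can be inspected with stock salt runners. This is a sketch of generic salt usage, not anything soup-specific; the jid shown is just the one from this post.)

```
# List recent salt jobs to find the run that reported the failed states
sudo salt-run jobs.list_jobs

# Pull the full return data for that job; YAML output is easier to scan
# for the individual failed states than the default summary
sudo salt-run jobs.lookup_jid 20220409001105008902 --out=yaml
```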
FWIW, I can log into my manager node, there are alerts there for the last 24 hours, my Grafana tab has graphs, etc., and sudo salt '*' so.status returns all green for all nodes. Am I just getting triggered unnecessarily? What else should I be looking at? Thanks. Larry
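(A quick way to sanity-check grid health from the manager, as a rough sketch using stock salt commands plus Security Onion's local status script; adapt to your own grid.)

```
# Confirm every minion is up and answering the salt master
sudo salt '*' test.ping
sudo salt-run manage.status

# Security Onion's local service status check on the node you're logged into
sudo so-status
```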
-
I think we are all waiting to find out what the issue was with that last hotfix.
-
It looks like the agents were probably upgrading at the time that soup ran and were unable to talk to the salt master service. The verification at the end just looks for the word "ERROR", even though some errors are not harmful. If all of your nodes are green and data is coming in, then you are where you want to be. Salt will ensure the box is in the right "state" even if it takes multiple runs to get there.

@Acewiza As for what the hotfix addressed, it was specific to Ubuntu. We did update the version of salt for CentOS as well, since it was related to a critical CVE. We did introduce a regression in soup for airgap, where we were telling soup to update salt before the new repo files were copied to the airgap repo located on the manager.

The core issue here was that Saltstack pulled the salt package from their repo, causing all new Ubuntu installs to fail. We had to release multiple hotfixes as we got different scenarios from our customers and community. To prevent this from happening again, we are now hosting a copy of the Ubuntu salt repo at the Security Onion public repo. Normally we would not do a hotfix for salt and would wait to upgrade in our next major release, but we couldn't leave our Ubuntu users in a broken state. None of the containers or the actual states changed in the hotfix, just the delivery mechanism for the Ubuntu packages.
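(If you'd rather nudge things along than wait for the next scheduled run, a minimal sketch using plain salt from the manager is below. This is generic salt usage, not an official Security Onion procedure, and /path/to/soup.log is a placeholder for wherever your soup.log actually lives.)

```
# Roughly what the end-of-run verification does: count lines containing "ERROR"
# (replace /path/to/soup.log with the real location of your soup.log)
grep -c ERROR /path/to/soup.log

# Re-apply the highstate so salt can converge any states that failed while
# the minions were mid-upgrade; queue=True waits for an in-progress state
# run instead of erroring out
sudo salt '*' state.highstate queue=True
```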