- 
                Notifications
    
You must be signed in to change notification settings  - Fork 69
 
Description
Context
- 1 x log server (VM on Apple M2)
 - 2 x log client (VM on Apple M2)
 - Using systemd-journal-remote on server (http)
 - Using systemd-journal-upload on clients (http)
 
Testing Method
We first cleanup the server with:
rm /var/log/journal/remote/*
On each client we run (side by side / simultaneously:
for x in {1..10000}; do echo "EVENT $x" | logger ; doneSo we expect 20,000 records on the server, which we validate:
root@log-server:/home/ansible# journalctl -D /var/log/journal/remote/ --no-pager | grep -v Boot | wc -l
20000
I've tested this many times to confirm that the systemd-journal-remote/upload mechanism is OK.
The failing code
import select
from systemd import journal
 j = journal.Reader(path="/var/log/journal/remote")
 j.seek_tail()
 j.get_previous() # prevent wrap around
 p = select.poll()
 p.register(j, j.get_events())
 print("Starting....")
 event_counter = 0
 while p.poll():
      for x in j:
          event_counter += 1
      if j.process() != journal.APPEND:
          print("...WAITING....")
      else:
          print(f"Event counter: {event_counter}")First of all, I'm not sure if this code is right. If I've made a glaring mistake, I apologise in advance for this report.
That said, I get correct results if I run my '10K log generator command' on one client at a time.
Event counter: 9915
Event counter: 9933
Event counter: 9945
Event counter: 9950
Event counter: 9987
Event counter: 10000
If I run the command at the same time on each client at once, I would expect 20K records, but it gets nowhere near that:
Event counter: 10228
Event counter: 10228
Event counter: 10228
Event counter: 10228
Event counter: 10228
Event counter: 10228
Event counter: 10228
In this example I'm missing almost half of all expected events. Yet I can confirm on the log server with the 'journalctl' command that there are actually 20K log records stored on the server.
Even if I test with just 1K events per client and run that simultaneously it's also clearly visible:
Event counter: 1096
Event counter: 1141
Event counter: 1215
Event counter: 1430
Happy to run any additional tests if that helps.
If I made a huge mistake, please let me know.
Update 1
I've been testing this issue on virtual machines on my mac.
I've also a few physical x86 micro pc's and the issue is the same.
Rate-limiting the logging with sleep .01 doesn't impact the result.
Question, is python-systemd built to handle remote logging from multiple clients through polling?