cfs drift and then all crash #406

swarm5 · 2021-06-28T18:23:51Z

swarm5
Jun 28, 2021

@whoenig @jpreiss I recently tested a centralised drone formation using Crazyswarm. 20 cfs were arranged in a square and flown around the field.
Positioning: OptiTrack output at 40 Hz, with single-marker tracking for each cf.
Control: the cmdposition command was used for position control throughout.
During one of the tests, the cfs drifted en masse and then crashed. This is a rare event: we have run this test dozens of times and it has happened only once. Can you help analyse the possible causes?
Here is the video link: https://youtu.be/uofjD1xKTuI

Let me clarify a few other things.
1. The cfs seem to leave the scene in the video only because of the camera's field of view; in fact they do not fly out of the area.
2. There is a balloon on the left side of the scene, a prop from another of my tests. In the motion capture environment it produces about 15 spurious marker points.
3. The program I run on top of Crazyswarm has a marker-loss check: when the master program receives location data for a cf fewer than 10 times in 2 seconds (the same location data that rviz shows), that cf is removed from the aircraft management list and no more location data is sent.
4. In the video the cfs show two behaviours. First they drift in the air, presumably because they are not receiving positioning data. Then they crash and tumble. My guess is that positioning data and position control commands were received again later, but because the cfs had drifted a large distance, the step in the cmdposition setpoint was too large and all the cfs crashed.
5. Possible cause 1: point cloud matching timeout. The balloon reflections may have introduced many false marker points, causing a point cloud matching timeout of about 2 seconds in Crazyswarm. The cfs then received no positioning data and drifted, and when a frame of positioning data finally arrived they crashed and flew around. One thing does not fit, however. If the point cloud match failed for 2 seconds, the cfs received no positioning data for 2 seconds, so the central control program should have removed all the aircraft from the flight list, and the aircraft should then have received no commands and crashed. (I still need to test whether the aircraft fly sideways after being removed from the management list.)
6. Possible cause 2: communication blockage. Alternatively, rviz may have been receiving the aircraft's positioning data in real time, so the central control program did not remove the cfs from the managed queue, while the PA communication was briefly blocked. This explanation also has a gap: if the communication is blocked, the cfs will also crash because they do not receive the cmdposition command from the HF.
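
The marker-loss check in point 3 can be pictured as a sliding-window counter. The sketch below is only an illustration of that rule, assuming the check counts samples inside the most recent 2 seconds; `PoseWatchdog` and `prune` are hypothetical names, not part of Crazyswarm or of the actual program.

```python
from collections import deque


class PoseWatchdog:
    """Tracks pose arrival times for one cf in a sliding time window (hypothetical helper)."""

    def __init__(self, window_s=2.0, min_updates=10):
        self.window_s = window_s        # window length from point 3
        self.min_updates = min_updates  # minimum sample count from point 3
        self.stamps = deque()

    def on_pose(self, t):
        """Record that a pose sample arrived at time t (seconds)."""
        self.stamps.append(t)

    def is_healthy(self, now):
        """True if at least min_updates samples arrived in the last window_s seconds."""
        while self.stamps and self.stamps[0] < now - self.window_s:
            self.stamps.popleft()
        return len(self.stamps) >= self.min_updates


def prune(active, watchdogs, now):
    """Return the ids from `active` whose watchdog is still healthy."""
    return [cf for cf in active if watchdogs[cf].is_healthy(now)]


# Simulate 40 Hz data for one second, then silence.
wd = PoseWatchdog()
for i in range(41):
    wd.on_pose(i * 0.025)
```

Under this sliding-window reading, a cf that was receiving 40 Hz data is still counted as healthy for roughly 1.8 s after the last sample, because the window still holds at least 10 older samples. If the real check works this way, an outage of just under 2 seconds would not necessarily empty the management list.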

Does anyone have any thoughts on this issue please?

whoenig · 2021-06-28T19:58:26Z

whoenig
Jun 28, 2021
Maintainer

The drifting is typically caused by loss of position information. This could be communication issues or tracking issues, but in almost all cases it is tracking-related. Our tracking code at https://github.com/USC-ACTLab/libobjecttracker/blob/master/src/object_tracker.cpp#L449-L528 removes CFs that have not been updated in 0.5 s (regarding your point 3: how did you get 2 seconds? Did you adjust the value in the code?).

A dramatic crash like that is usually caused by completely wrong state or setpoint information (the controller has very high gains and tries to overcompensate, or the EKF diverges). Getting a crash like that after drifting should be nearly impossible. One reason I can see is that you somehow hit https://github.com/USC-ACTLab/libobjecttracker/blob/master/src/object_tracker.cpp#L437-L438 and re-initialize the tracking mid-air. This sounds very dangerous and I am not sure why we have this functionality in the first place. Perhaps @jpreiss remembers? If you can reproduce the issue, I suggest removing the `lastCalldt > 0.4` condition and checking if that helps.

If you have the uSD card deck or the console output of the flight, that could also give an indication on what happened.

The spurious markers shouldn't be a big problem if the balloon is far away. Since you don't see issues before the "drifting" stage, I don't think they are causing the problems.
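
To make the two thresholds mentioned in this reply concrete, here is a toy model of how a pause in tracker updates compares against them. It is not the libobjecttracker code: `classify_gap` is a hypothetical function, and it assumes for simplicity that a full stall delays both the tracker call and every per-object update by the same amount.

```python
STALE_TIMEOUT_S = 0.5  # "not been updated in 0.5 s", as quoted in this reply
REINIT_GAP_S = 0.4     # the `lastCalldt > 0.4` condition, as quoted in this reply


def classify_gap(gap_s):
    """Return which of the two quoted conditions a pause of gap_s seconds would meet."""
    effects = []
    if gap_s > REINIT_GAP_S:
        effects.append("reinitialize")
    if gap_s > STALE_TIMEOUT_S:
        effects.append("drop_stale")
    return effects


# A normal 40 Hz frame interval, a borderline pause, and a 2 s stall.
for gap in (0.025, 0.45, 2.0):
    print(gap, classify_gap(gap))
```

In this simplified picture a normal 40 Hz frame interval meets neither condition, while a stall of one to two seconds meets both.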

7 replies

swarm5 Jul 2, 2021
Author

I think the essential question is: What causes the failure of tracking?
@jpreiss You seem to think that it is an error in the assignment algorithm. But the error message recorded in my ROS log is "Connection refused". Would an error in the assignment algorithm cause this error message? I am not very familiar with the underlying ROS mechanism.

jpreiss Jul 2, 2021
Maintainer

@swarm5 sorry for the confusion, I am only responding to @whoenig's question about that line of code, not your overall issue.

whoenig Jul 2, 2021
Maintainer

@jpreiss Cool ideas! For the assignment, I actually have a paper for the incremental version of the problem at https://whoenig.github.io/publications/2018_AAMAS_a_Hoenig.pdf (Alg 2 and 3). However, I don't think the top-2 idea works well in the presence of spurious markers. For example, consider a case where a shiny surface (e.g., uSD-deck) is occasionally detected as a marker, then the top-2 assignments will have almost identical cost but at the same time both assignments are indeed valid. I do like the idea with the velocity, but I think we might need a dataset to really make sure we don't over-engineer something that otherwise causes harm. For the broadcasting part, I agree that the firmware should handle that (and it does pretty well - usually there is just a slow drift at least with the CF 2.1 with the improved IMU).

Good point on checking git - I think this should be removed. It was probably there for some purpose that I don't remember anymore, and the side effects are simply too risky.

FYI, since this motion capture tracking code is very useful for other robotic applications, I just put a standalone ROS repo at https://github.com/whoenig/motion_capture_tracking.

@swarm5 None of your messages looked concerning to me. As you said, some of them indicated some issues inside ROS (which are "normal"?). The others were warnings that went away after just one mocap frame (which could have been caused by a temporary slowness of the PC or radio).

jpreiss Jul 2, 2021
Maintainer

OK, I see how the top-2 assignments method might fail. I deleted the velocity idea because I had the same concern, especially with velocity estimates necessarily being either noisy or delayed.

I guess there is a big literature on multi-object tracking we should consult...

swarm5 Jul 2, 2021
Author

Well, I thought I had located the reason why the cfs were not receiving data for a while. Now I'm back in limbo again.

swarm5 · 2021-06-29T16:32:22Z

swarm5
Jun 29, 2021
Author

Additional information.

@whoenig My colleague told me that he had encountered a similar situation, which means we have experienced two such system crashes in total. He noticed some other details, which I describe here as a supplement.
During his test he was watching the rviz interface. Suddenly all the cf axes in rviz froze (for about 1 s, as he recalls), and then rviz suddenly updated the positions of all the cfs. It felt as if the libobjecttracker tracking had lagged by about one second, and then all the cfs crashed.

Problem analysis.

I think the two incidents are the same phenomenon. Judging from the video, the cfs did not receive positioning data for about two seconds and therefore drifted in the air. Then the cfs somehow received positioning data again, and the target position was too far from the current estimated position, which caused the cfs to roll over.
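
The "target too far from the estimate" effect can be illustrated with a small guard that limits how far a commanded position may be from the current estimate. This is only a sketch of one possible mitigation, not something proposed in this thread; `clamp_setpoint` and the 0.5 m limit are hypothetical choices.

```python
import math


def clamp_setpoint(estimate, target, max_step_m=0.5):
    """Return a setpoint at most max_step_m away from estimate, in the direction of target."""
    delta = [t - e for e, t in zip(estimate, target)]
    dist = math.sqrt(sum(d * d for d in delta))
    if dist <= max_step_m:
        # Target is already close enough; command it unchanged.
        return list(target)
    scale = max_step_m / dist
    return [e + d * scale for e, d in zip(estimate, delta)]


# After drifting, the estimate is 5 m away from the formation target.
safe = clamp_setpoint([0.0, 0.0, 1.0], [3.0, 4.0, 1.0])
print(safe)
```

With such a guard the commanded position would approach the formation target in bounded steps instead of jumping the full drift distance in one command.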

There are two questions that need to be addressed here.

(1) Why did the cfs not receive the positioning data for a long time?

I think it has nothing to do with the PA, because the cmdposition command was still being sent the whole time; otherwise the cfs would also have crashed from not receiving the streaming command. So this should be a problem related to Crazyswarm.
a. First possibility: the point cloud data was lost. There may be a problem with the network transfer of location data between OptiTrack and Crazyswarm (although I use a wired connection). If you also suspect this part, I will write a program to record whether the communication between OptiTrack and Crazyswarm shows long network fluctuations or packet loss over the course of a day.
b. Second possibility: a lag in the marker assignment process. Could the problem arise when the assignment of the point cloud to the cfs stalls?

(2) Why did the location data suddenly come back later?

a. If it was a network fluctuation between OptiTrack and Crazyswarm, the network transmission may simply have recovered.
b. It is also possible that the object tracker was re-initialized in the air, as you said, or that the assignment algorithm suddenly returned to normal.

I have a few questions about the libobjecttracker code.

When re-initializing in the air, isn't the minimum-cost assignment computed between the markers and the initial positions of the cfs, rather than the last known cf positions?
https://github.com/USC-ACTLab/libobjecttracker/blob/f457a06916fcb822b7280c3aad7380937e51cc09/src/object_tracker.cpp#L399-L404
Why does a re-initialize operation exist at all? It seems I need to ask @jpreiss.
How is a re-initialize triggered? The trigger condition is `lastCalldt > 0.4`.
According to this part of the code, it seems that the variable `lastCall` had not been updated for a long time. `lastCall` is updated by calling the function ObjectTracker::updatePosition. Can we therefore assume that Crazyswarm had not received point cloud data from Motive for about two seconds?
https://github.com/USC-ACTLab/libobjecttracker/blob/master/src/object_tracker.cpp#L425-L438

1 reply

whoenig Jun 29, 2021
Maintainer

It is very unlikely that this is caused by a connection lag with Motive. However, a similar outage of the data can occur if the crazyflie_server is blocked by a function. In the current version, any non-broadcast communication with a crazyflie could cause that (for example, sending the goTo command to an individual Crazyflie). Do you have such calls in your code, or do you only use the cmdposition topic?

I think this re-initialize code in libobjecttracker should be removed. It was probably well-intended for cases where one wants to keep the crazyflie_server running even though one drone crashed (and lost tracking), but the risk of having it seems pretty high IMO.

swarm5 · 2021-06-29T18:56:05Z

swarm5
Jun 29, 2021
Author

In fact, I only use the cmdposition command in all my programs. I didn't use any high-level commands because I knew there was a problem switching between high-level and low-level commands. Even takeoff and landing are implemented with cmdposition.

Is the connection between Crazyswarm and Motive via TCP? If it uses UDP, there may also be data sticking.

I agree with deleting the re-initialization code. It carries a lot of risk.

I want to locate the problem because I want to make the whole system more stable. This failure has a low probability: I could install an SD card for logging, but I might fly dozens of times without reproducing the problem. However, I think I can test the positioning data layer of the system. I want to record the Motive output data or the object tracker output data for 12 hours to see if there is an anomaly. I wonder if that's feasible. What else could cause the system to block? I think I'll do some tests.
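
A long recording like the one described here could be screened for outages by looking only at the arrival times of the frames. The sketch below assumes a list of arrival timestamps in seconds has already been recorded (for example one timestamp per received frame) and that the nominal rate is the 40 Hz mentioned at the start of the thread; `find_gaps` and the factor of 4 are hypothetical choices.

```python
def find_gaps(stamps, expected_period_s=0.025, factor=4.0):
    """Return (start_time, duration) for every pause longer than factor * expected_period_s."""
    threshold = expected_period_s * factor
    gaps = []
    for prev, cur in zip(stamps, stamps[1:]):
        if cur - prev > threshold:
            gaps.append((prev, cur - prev))
    return gaps


# Three regular frames, a 2 s pause, then data again.
recorded = [0.0, 0.025, 0.05, 2.05, 2.075]
gaps = find_gaps(recorded)
print(gaps)
```

Running this over the timestamps from a 12-hour recording would list every pause together with the time at which it started, which can then be compared with the ROS log timestamps.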


1 reply

whoenig Jun 30, 2021
Maintainer

NatNet (Motive) uses UDP, see https://github.com/whoenig/NatNetSDKCrossplatform.

For your test, you could just leave the Crazyflies on the ground (i.e., don't fly) and check the ROS log messages in the morning. There will be warnings if there are any tracking-related issues, or latency issues.

swarm5 · 2021-07-01T17:03:56Z

swarm5
Jul 1, 2021
Author

@whoenig @jpreiss I just checked the ROS log files on my computer and compared the log of the incident with the other logs. I found two differences.
I have uploaded all the log files: https://drive.google.com/drive/folders/1wM9sJAL8AUdaCBO6qZNMmT8S9ATdd4Aj?usp=sharing

1. Two consecutive `[Errno 111] Connection refused` error messages appear in `master.log`:

[rosmaster.threadpool][ERROR] 2021-06-24 09:06:56,193: Traceback (most recent call last):
File "/opt/ros/kinetic/lib/python2.7/dist-packages/rosmaster/threadpool.py", line 218, in run
result = cmd(*args)
File "/opt/ros/kinetic/lib/python2.7/dist-packages/rosmaster/master_api.py", line 210, in publisher_update_task
ret = xmlrpcapi(api).publisherUpdate('/master', topic, pub_uris)
File "/usr/lib/python2.7/xmlrpclib.py", line 1243, in call
return self.__send(self.__name, args)
File "/usr/lib/python2.7/xmlrpclib.py", line 1602, in __request
verbose=self.__verbose
File "/usr/lib/python2.7/xmlrpclib.py", line 1283, in request
return self.single_request(host, handler, request_body, verbose)
File "/usr/lib/python2.7/xmlrpclib.py", line 1311, in single_request
self.send_content(h, request_body)
File "/usr/lib/python2.7/xmlrpclib.py", line 1459, in send_content
connection.endheaders(request_body)
File "/usr/lib/python2.7/httplib.py", line 1082, in endheaders
self._send_output(message_body)
File "/usr/lib/python2.7/httplib.py", line 909, in _send_output
self.send(msg)
File "/usr/lib/python2.7/httplib.py", line 871, in send
self.connect()
File "/usr/lib/python2.7/httplib.py", line 848, in connect
self.timeout, self.source_address)
File "/usr/lib/python2.7/socket.py", line 575, in create_connection
raise err
error: [Errno 111] Connection refused
[rosmaster.threadpool][ERROR] 2021-06-24 09:06:56,193: Traceback (most recent call last):
File "/opt/ros/kinetic/lib/python2.7/dist-packages/rosmaster/threadpool.py", line 218, in run
result = cmd(*args)
File "/opt/ros/kinetic/lib/python2.7/dist-packages/rosmaster/master_api.py", line 210, in publisher_update_task
ret = xmlrpcapi(api).publisherUpdate('/master', topic, pub_uris)
File "/usr/lib/python2.7/xmlrpclib.py", line 1243, in call
return self.__send(self.__name, args)
File "/usr/lib/python2.7/xmlrpclib.py", line 1602, in __request
verbose=self.__verbose
File "/usr/lib/python2.7/xmlrpclib.py", line 1283, in request
return self.single_request(host, handler, request_body, verbose)
File "/usr/lib/python2.7/xmlrpclib.py", line 1311, in single_request
self.send_content(h, request_body)
File "/usr/lib/python2.7/xmlrpclib.py", line 1459, in send_content
connection.endheaders(request_body)
File "/usr/lib/python2.7/httplib.py", line 1082, in endheaders
self._send_output(message_body)
File "/usr/lib/python2.7/httplib.py", line 909, in _send_output
self.send(msg)
File "/usr/lib/python2.7/httplib.py", line 871, in send
self.connect()
File "/usr/lib/python2.7/httplib.py", line 848, in connect
self.timeout, self.source_address)
File "/usr/lib/python2.7/socket.py", line 575, in create_connection
raise err
error: [Errno 111] Connection refused

2. Multiple `No updated pose for CF` warnings appear in `rosout.log`:

1624496797.847429897 WARN [/home/ss/crazyswarm/ros_ws/src/crazyswarm/src/crazyswarm_server.cpp:969(runFast) [topics: /rosout, /pointCloud, /tf] No updated pose for CF cf2 for 0.010171 s.
1624496797.847437042 WARN [/home/ss/crazyswarm/ros_ws/src/crazyswarm/src/crazyswarm_server.cpp:969(runFast) [topics: /rosout, /pointCloud, /tf] No updated pose for CF cf3 for 0.010171 s.
1624496797.847444103 WARN [/home/ss/crazyswarm/ros_ws/src/crazyswarm/src/crazyswarm_server.cpp:969(runFast) [topics: /rosout, /pointCloud, /tf] No updated pose for CF cf4 for 0.010171 s.
1624496797.847450957 WARN [/home/ss/crazyswarm/ros_ws/src/crazyswarm/src/crazyswarm_server.cpp:969(runFast) [topics: /rosout, /pointCloud, /tf] No updated pose for CF cf5 for 0.010171 s.
1624496797.847457799 WARN [/home/ss/crazyswarm/ros_ws/src/crazyswarm/src/crazyswarm_server.cpp:969(runFast) [topics: /rosout, /pointCloud, /tf] No updated pose for CF cf6 for 0.018882 s.
1624496797.847464692 WARN [/home/ss/crazyswarm/ros_ws/src/crazyswarm/src/crazyswarm_server.cpp:969(runFast) [topics: /rosout, /pointCloud, /tf] No updated pose for CF cf7 for 0.010171 s.
1624496797.847471998 WARN [/home/ss/crazyswarm/ros_ws/src/crazyswarm/src/crazyswarm_server.cpp:969(runFast) [topics: /rosout, /pointCloud, /tf] No updated pose for CF cf8 for 0.010171 s.
1624496797.847478973 WARN [/home/ss/crazyswarm/ros_ws/src/crazyswarm/src/crazyswarm_server.cpp:969(runFast) [topics: /rosout, /pointCloud, /tf] No updated pose for CF cf9 for 0.018882 s.
1624496797.847485878 WARN [/home/ss/crazyswarm/ros_ws/src/crazyswarm/src/crazyswarm_server.cpp:969(runFast) [topics: /rosout, /pointCloud, /tf] No updated pose for CF cf10 for 0.018882 s.
1624496797.847493039 WARN [/home/ss/crazyswarm/ros_ws/src/crazyswarm/src/crazyswarm_server.cpp:969(runFast) [topics: /rosout, /pointCloud, /tf] No updated pose for CF cf11 for 0.018882 s.
1624496797.847499826 WARN [/home/ss/crazyswarm/ros_ws/src/crazyswarm/src/crazyswarm_server.cpp:969(runFast) [topics: /rosout, /pointCloud, /tf] No updated pose for CF cf12 for 0.010171 s.
1624496797.847506703 WARN [/home/ss/crazyswarm/ros_ws/src/crazyswarm/src/crazyswarm_server.cpp:969(runFast) [topics: /rosout, /pointCloud, /tf] No updated pose for CF cf13 for 0.010171 s.
1624496797.847513418 WARN [/home/ss/crazyswarm/ros_ws/src/crazyswarm/src/crazyswarm_server.cpp:969(runFast) [topics: /rosout, /pointCloud, /tf] No updated pose for CF cf14 for 0.010171 s.
1624496797.847520409 WARN [/home/ss/crazyswarm/ros_ws/src/crazyswarm/src/crazyswarm_server.cpp:969(runFast) [topics: /rosout, /pointCloud, /tf] No updated pose for CF cf15 for 0.010171 s.
1624496797.847527409 WARN [/home/ss/crazyswarm/ros_ws/src/crazyswarm/src/crazyswarm_server.cpp:969(runFast) [topics: /rosout, /pointCloud, /tf] No updated pose for CF cf16 for 0.010171 s.
1624496797.847534333 WARN [/home/ss/crazyswarm/ros_ws/src/crazyswarm/src/crazyswarm_server.cpp:969(runFast) [topics: /rosout, /pointCloud, /tf] No updated pose for CF cf17 for 0.010171 s.
1624496797.847541223 WARN [/home/ss/crazyswarm/ros_ws/src/crazyswarm/src/crazyswarm_server.cpp:969(runFast) [topics: /rosout, /pointCloud, /tf] No updated pose for CF cf18 for 0.010171 s.
1624496797.847548408 WARN [/home/ss/crazyswarm/ros_ws/src/crazyswarm/src/crazyswarm_server.cpp:969(runFast) [topics: /rosout, /pointCloud, /tf] No updated pose for CF cf19 for 0.018882 s.
1624496797.847555585 WARN [/home/ss/crazyswarm/ros_ws/src/crazyswarm/src/crazyswarm_server.cpp:969(runFast) [topics: /rosout, /pointCloud, /tf] No updated pose for CF cf20 for 0.018882 s.
1624496797.847717920 WARN [/home/ss/crazyswarm/ros_ws/src/crazyswarm/src/crazyswarm_server.cpp:969(runFast) [topics: /rosout, /pointCloud, /tf] No updated pose for CF cf1 for 0.010463 s.
1624496797.847728639 WARN [/home/ss/crazyswarm/ros_ws/src/crazyswarm/src/crazyswarm_server.cpp:969(runFast) [topics: /rosout, /pointCloud, /tf] No updated pose for CF cf2 for 0.010463 s.
1624496797.847735991 WARN [/home/ss/crazyswarm/ros_ws/src/crazyswarm/src/crazyswarm_server.cpp:969(runFast) [topics: /rosout, /pointCloud, /tf] No updated pose for CF cf3 for 0.010463 s.
1624496797.847743116 WARN [/home/ss/crazyswarm/ros_ws/src/crazyswarm/src/crazyswarm_server.cpp:969(runFast) [topics: /rosout, /pointCloud, /tf] No updated pose for CF cf4 for 0.010463 s.
1624496797.847750001 WARN [/home/ss/crazyswarm/ros_ws/src/crazyswarm/src/crazyswarm_server.cpp:969(runFast) [topics: /rosout, /pointCloud, /tf] No updated pose for CF cf5 for 0.010463 s.
1624496797.847756955 WARN [/home/ss/crazyswarm/ros_ws/src/crazyswarm/src/crazyswarm_server.cpp:969(runFast) [topics: /rosout, /pointCloud, /tf] No updated pose for CF cf6 for 0.019174 s.
1624496797.847764823 WARN [/home/ss/crazyswarm/ros_ws/src/crazyswarm/src/crazyswarm_server.cpp:969(runFast) [topics: /rosout, /pointCloud, /tf] No updated pose for CF cf7 for 0.010463 s.
1624496797.847771853 WARN [/home/ss/crazyswarm/ros_ws/src/crazyswarm/src/crazyswarm_server.cpp:969(runFast) [topics: /rosout, /pointCloud, /tf] No updated pose for CF cf8 for 0.010463 s.
1624496797.847778995 WARN [/home/ss/crazyswarm/ros_ws/src/crazyswarm/src/crazyswarm_server.cpp:969(runFast) [topics: /rosout, /pointCloud, /tf] No updated pose for CF cf9 for 0.019174 s.
1624496797.847786111 WARN [/home/ss/crazyswarm/ros_ws/src/crazyswarm/src/crazyswarm_server.cpp:969(runFast) [topics: /rosout, /pointCloud, /tf] No updated pose for CF cf10 for 0.019174 s.
1624496797.847793009 WARN [/home/ss/crazyswarm/ros_ws/src/crazyswarm/src/crazyswarm_server.cpp:969(runFast) [topics: /rosout, /pointCloud, /tf] No updated pose for CF cf11 for 0.019174 s.
1624496797.847800090 WARN [/home/ss/crazyswarm/ros_ws/src/crazyswarm/src/crazyswarm_server.cpp:969(runFast) [topics: /rosout, /pointCloud, /tf] No updated pose for CF cf12 for 0.010463 s.
1624496797.847807212 WARN [/home/ss/crazyswarm/ros_ws/src/crazyswarm/src/crazyswarm_server.cpp:969(runFast) [topics: /rosout, /pointCloud, /tf] No updated pose for CF cf13 for 0.010463 s.
1624496797.847814129 WARN [/home/ss/crazyswarm/ros_ws/src/crazyswarm/src/crazyswarm_server.cpp:969(runFast) [topics: /rosout, /pointCloud, /tf] No updated pose for CF cf14 for 0.010463 s.
1624496797.847821305 WARN [/home/ss/crazyswarm/ros_ws/src/crazyswarm/src/crazyswarm_server.cpp:969(runFast) [topics: /rosout, /pointCloud, /tf] No updated pose for CF cf15 for 0.010463 s.
1624496797.847828000 WARN [/home/ss/crazyswarm/ros_ws/src/crazyswarm/src/crazyswarm_server.cpp:969(runFast) [topics: /rosout, /pointCloud, /tf] No updated pose for CF cf16 for 0.010463 s.
1624496797.847835023 WARN [/home/ss/crazyswarm/ros_ws/src/crazyswarm/src/crazyswarm_server.cpp:969(runFast) [topics: /rosout, /pointCloud, /tf] No updated pose for CF cf17 for 0.010463 s.
1624496797.847841941 WARN [/home/ss/crazyswarm/ros_ws/src/crazyswarm/src/crazyswarm_server.cpp:969(runFast) [topics: /rosout, /pointCloud, /tf] No updated pose for CF cf18 for 0.010463 s.
1624496797.847848975 WARN [/home/ss/crazyswarm/ros_ws/src/crazyswarm/src/crazyswarm_server.cpp:969(runFast) [topics: /rosout, /pointCloud, /tf] No updated pose for CF cf19 for 0.019174 s.
1624496797.847855670 WARN [/home/ss/crazyswarm/ros_ws/src/crazyswarm/src/crazyswarm_server.cpp:969(runFast) [topics: /rosout, /pointCloud, /tf] No updated pose for CF cf20 for 0.019174 s.
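
Warnings in this format can be summarised per cf rather than read line by line. The following is a minimal sketch that assumes every warning line looks like the ones pasted above; `worst_gap_per_cf` is a hypothetical helper, and the regular expression is derived only from this excerpt.

```python
import re
from collections import defaultdict

# Matches: "<timestamp> WARN ... No updated pose for CF <name> for <seconds> s."
WARN_RE = re.compile(
    r"^(\d+\.\d+) WARN .*No updated pose for CF (\S+) for ([0-9.]+) s\.\s*$"
)


def worst_gap_per_cf(lines):
    """Map each cf name to the largest 'No updated pose' gap (seconds) found in lines."""
    worst = defaultdict(float)
    for line in lines:
        m = WARN_RE.match(line)
        if m:
            name, gap = m.group(2), float(m.group(3))
            worst[name] = max(worst[name], gap)
    return dict(worst)


prefix = ("WARN [/home/ss/crazyswarm/ros_ws/src/crazyswarm/src/crazyswarm_server.cpp:969(runFast) "
          "[topics: /rosout, /pointCloud, /tf] No updated pose for CF ")
sample = [
    "1624496797.847429897 " + prefix + "cf2 for 0.010171 s.",
    "1624496797.847457799 " + prefix + "cf6 for 0.018882 s.",
    "1624496797.847728639 " + prefix + "cf2 for 0.010463 s.",
    "1624496797.847756955 " + prefix + "cf6 for 0.019174 s.",
]
summary = worst_gap_per_cf(sample)
print(summary)
```

In the excerpt above the reported gaps are between roughly 10 ms and 19 ms, so a summary like this makes it easy to see whether any run contains gaps approaching a second or more.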

2 replies

swarm5 Jul 1, 2021
Author

In order to compare and analyze with other logs, I uploaded four test logs.

swarm5 Jul 1, 2021
Author

https://drive.google.com/drive/folders/14FlTd1Nwqa1iveMs8-RSdonr3WxWVMbi?usp=sharing
It contains four consecutive test runs. The third one, in the folder `e419fbbc-d487-11eb-924e-b42e99612c7c`, is the log of the accident.
6/24 08:41 eaf1a956-d484-11eb-924e-b42e99612c7c
6/24 09:00 90c09a52-d487-11eb-924e-b42e99612c7c
6/24 09:02 e419fbbc-d487-11eb-924e-b42e99612c7c
6/24 09:08 826340bc-d488-11eb-924e-b42e99612c7c
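
The folder names above have the layout of version-1 UUIDs, which embed their creation time. Assuming they are the run ids generated when each ROS session was launched, the start time of each run can be decoded and compared with the timestamps inside the logs. `uuid1_time_utc` is a hypothetical helper; the decoded times are in UTC, not local time.

```python
import uuid
from datetime import datetime, timedelta

# Version-1 UUID timestamps count 100 ns intervals since this date (RFC 4122).
UUID_EPOCH = datetime(1582, 10, 15)


def uuid1_time_utc(run_id):
    """Decode the UTC creation time embedded in a version-1 UUID string."""
    u = uuid.UUID(run_id)
    if u.version != 1:
        raise ValueError("not a version-1 UUID: %s" % run_id)
    return UUID_EPOCH + timedelta(microseconds=u.time // 10)


run_ids = [
    "eaf1a956-d484-11eb-924e-b42e99612c7c",
    "90c09a52-d487-11eb-924e-b42e99612c7c",
    "e419fbbc-d487-11eb-924e-b42e99612c7c",
    "826340bc-d488-11eb-924e-b42e99612c7c",
]
times = [uuid1_time_utc(r) for r in run_ids]
for r, t in zip(run_ids, times):
    print(r, t.isoformat())
```

The differences between consecutive decoded times give the spacing between runs without relying on file modification times.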

swarm5 · 2021-07-01T17:51:10Z

swarm5
Jul 1, 2021
Author

I am not familiar with the ROS mechanism, but I read the log context around the error. It looks to me as if this is not a communication problem between Motive and Crazyswarm, but an internal communication problem within ROS?

[rosmaster.master][INFO] 2021-06-24 09:06:56,158: -SERVICE [/cf12/notify_setpoints_stop] /crazyswarm_server rosrpc://localhost:49505
[rosmaster.master][INFO] 2021-06-24 09:06:56,159: -SERVICE [/cf12/update_params] /crazyswarm_server rosrpc://localhost:49505
[rosmaster.master][INFO] 2021-06-24 09:06:56,159: -SERVICE [/cf13/upload_trajectory] /crazyswarm_server rosrpc://localhost:49505
[rosmaster.master][INFO] 2021-06-24 09:06:56,160: -SERVICE [/cf13/start_trajectory] /crazyswarm_server rosrpc://localhost:49505
[rosmaster.master][INFO] 2021-06-24 09:06:56,163: -SERVICE [/cf13/takeoff] /crazyswarm_server rosrpc://localhost:49505
[rosmaster.master][INFO] 2021-06-24 09:06:56,163: -SERVICE [/cf13/land] /crazyswarm_server rosrpc://localhost:49505
[rosmaster.master][INFO] 2021-06-24 09:06:56,164: -SERVICE [/cf13/go_to] /crazyswarm_server rosrpc://localhost:49505
[rosmaster.master][INFO] 2021-06-24 09:06:56,165: -SERVICE [/cf13/set_group_mask] /crazyswarm_server rosrpc://localhost:49505
[rosmaster.master][INFO] 2021-06-24 09:06:56,166: -SERVICE [/cf13/notify_setpoints_stop] /crazyswarm_server rosrpc://localhost:49505
[rosmaster.master][INFO] 2021-06-24 09:06:56,167: -SERVICE [/cf13/update_params] /crazyswarm_server rosrpc://localhost:49505
[rosmaster.master][INFO] 2021-06-24 09:06:56,167: -SERVICE [/cf14/upload_trajectory] /crazyswarm_server rosrpc://localhost:49505
[rosmaster.master][INFO] 2021-06-24 09:06:56,168: -SERVICE [/cf14/start_trajectory] /crazyswarm_server rosrpc://localhost:49505
[rosmaster.master][INFO] 2021-06-24 09:06:56,168: -SERVICE [/cf14/takeoff] /crazyswarm_server rosrpc://localhost:49505
[rosmaster.master][INFO] 2021-06-24 09:06:56,169: -SERVICE [/cf14/land] /crazyswarm_server rosrpc://localhost:49505
[rosmaster.master][INFO] 2021-06-24 09:06:56,169: -SERVICE [/cf14/go_to] /crazyswarm_server rosrpc://localhost:49505
[rosmaster.master][INFO] 2021-06-24 09:06:56,170: -SERVICE [/cf14/set_group_mask] /crazyswarm_server rosrpc://localhost:49505
[rosmaster.master][INFO] 2021-06-24 09:06:56,170: -SERVICE [/cf14/notify_setpoints_stop] /crazyswarm_server rosrpc://localhost:49505
[rosmaster.master][INFO] 2021-06-24 09:06:56,171: -SERVICE [/cf14/update_params] /crazyswarm_server rosrpc://localhost:49505
[rosmaster.master][INFO] 2021-06-24 09:06:56,171: -SERVICE [/cf15/upload_trajectory] /crazyswarm_server rosrpc://localhost:49505
[rosmaster.master][INFO] 2021-06-24 09:06:56,172: -SERVICE [/cf15/start_trajectory] /crazyswarm_server rosrpc://localhost:49505
[rosmaster.master][INFO] 2021-06-24 09:06:56,172: -SERVICE [/cf15/takeoff] /crazyswarm_server rosrpc://localhost:49505
[rosmaster.master][INFO] 2021-06-24 09:06:56,173: -SERVICE [/cf15/land] /crazyswarm_server rosrpc://localhost:49505
[rosmaster.master][INFO] 2021-06-24 09:06:56,173: -SERVICE [/cf15/go_to] /crazyswarm_server rosrpc://localhost:49505
[rosmaster.master][INFO] 2021-06-24 09:06:56,174: -SERVICE [/cf15/set_group_mask] /crazyswarm_server rosrpc://localhost:49505
[rosmaster.master][INFO] 2021-06-24 09:06:56,174: -SERVICE [/cf15/notify_setpoints_stop] /crazyswarm_server rosrpc://localhost:49505
[rosmaster.master][INFO] 2021-06-24 09:06:56,175: -SERVICE [/cf15/update_params] /crazyswarm_server rosrpc://localhost:49505
[rosmaster.master][INFO] 2021-06-24 09:06:56,175: -SERVICE [/cf16/upload_trajectory] /crazyswarm_server rosrpc://localhost:49505
[rosmaster.master][INFO] 2021-06-24 09:06:56,176: -SERVICE [/cf16/start_trajectory] /crazyswarm_server rosrpc://localhost:49505
[rosmaster.master][INFO] 2021-06-24 09:06:56,176: -SERVICE [/cf16/takeoff] /crazyswarm_server rosrpc://localhost:49505
[rosmaster.master][INFO] 2021-06-24 09:06:56,177: -SERVICE [/cf16/land] /crazyswarm_server rosrpc://localhost:49505
[rosmaster.master][INFO] 2021-06-24 09:06:56,178: -SERVICE [/cf16/go_to] /crazyswarm_server rosrpc://localhost:49505
[rosmaster.master][INFO] 2021-06-24 09:06:56,178: -SERVICE [/cf16/set_group_mask] /crazyswarm_server rosrpc://localhost:49505
[rosmaster.master][INFO] 2021-06-24 09:06:56,179: -SERVICE [/cf16/notify_setpoints_stop] /crazyswarm_server rosrpc://localhost:49505
[rosmaster.master][INFO] 2021-06-24 09:06:56,179: -SERVICE [/cf16/update_params] /crazyswarm_server rosrpc://localhost:49505
[rosmaster.master][INFO] 2021-06-24 09:06:56,180: -SERVICE [/cf17/upload_trajectory] /crazyswarm_server rosrpc://localhost:49505
[rosmaster.master][INFO] 2021-06-24 09:06:56,180: -SERVICE [/cf17/start_trajectory] /crazyswarm_server rosrpc://localhost:49505
[rosmaster.master][INFO] 2021-06-24 09:06:56,181: -SERVICE [/cf17/takeoff] /crazyswarm_server rosrpc://localhost:49505
[rosmaster.master][INFO] 2021-06-24 09:06:56,181: -SERVICE [/cf17/land] /crazyswarm_server rosrpc://localhost:49505
[rosmaster.master][INFO] 2021-06-24 09:06:56,182: -SERVICE [/cf17/go_to] /crazyswarm_server rosrpc://localhost:49505
[rosmaster.master][INFO] 2021-06-24 09:06:56,182: -SERVICE [/cf17/set_group_mask] /crazyswarm_server rosrpc://localhost:49505
[rosmaster.master][INFO] 2021-06-24 09:06:56,183: -SERVICE [/cf17/notify_setpoints_stop] /crazyswarm_server rosrpc://localhost:49505
[rosmaster.master][INFO] 2021-06-24 09:06:56,183: -SERVICE [/cf17/update_params] /crazyswarm_server rosrpc://localhost:49505
[rosmaster.master][INFO] 2021-06-24 09:06:56,184: -SERVICE [/cf18/upload_trajectory] /crazyswarm_server rosrpc://localhost:49505
[rosmaster.master][INFO] 2021-06-24 09:06:56,185: -SERVICE [/cf18/start_trajectory] /crazyswarm_server rosrpc://localhost:49505
[rosmaster.master][INFO] 2021-06-24 09:06:56,185: -SERVICE [/cf18/takeoff] /crazyswarm_server rosrpc://localhost:49505
[rosmaster.master][INFO] 2021-06-24 09:06:56,186: -SERVICE [/cf18/land] /crazyswarm_server rosrpc://localhost:49505
[rosmaster.master][INFO] 2021-06-24 09:06:56,186: -SERVICE [/cf18/go_to] /crazyswarm_server rosrpc://localhost:49505
[rosmaster.master][INFO] 2021-06-24 09:06:56,187: -SERVICE [/cf18/set_group_mask] /crazyswarm_server rosrpc://localhost:49505
[rosmaster.master][INFO] 2021-06-24 09:06:56,187: -SERVICE [/cf18/notify_setpoints_stop] /crazyswarm_server rosrpc://localhost:49505
[rosmaster.master][INFO] 2021-06-24 09:06:56,188: -SERVICE [/cf18/update_params] /crazyswarm_server rosrpc://localhost:49505
[rosmaster.master][INFO] 2021-06-24 09:06:56,188: -SERVICE [/cf19/upload_trajectory] /crazyswarm_server rosrpc://localhost:49505
[rosmaster.master][INFO] 2021-06-24 09:06:56,189: -SERVICE [/cf19/start_trajectory] /crazyswarm_server rosrpc://localhost:49505
[rosmaster.master][INFO] 2021-06-24 09:06:56,189: -SERVICE [/cf19/takeoff] /crazyswarm_server rosrpc://localhost:49505
[rosmaster.master][INFO] 2021-06-24 09:06:56,190: -SERVICE [/cf19/land] /crazyswarm_server rosrpc://localhost:49505
[rosmaster.master][INFO] 2021-06-24 09:06:56,190: -SERVICE [/cf19/go_to] /crazyswarm_server rosrpc://localhost:49505
[rosmaster.master][INFO] 2021-06-24 09:06:56,191: -SERVICE [/cf19/set_group_mask] /crazyswarm_server rosrpc://localhost:49505
[rosmaster.master][INFO] 2021-06-24 09:06:56,192: -SERVICE [/cf19/notify_setpoints_stop] /crazyswarm_server rosrpc://localhost:49505
[rosmaster.master][INFO] 2021-06-24 09:06:56,192: -SERVICE [/cf19/update_params] /crazyswarm_server rosrpc://localhost:49505
[rosmaster.threadpool][ERROR] 2021-06-24 09:06:56,193: Traceback (most recent call last):
  File "/opt/ros/kinetic/lib/python2.7/dist-packages/rosmaster/threadpool.py", line 218, in run
    result = cmd(*args)
  File "/opt/ros/kinetic/lib/python2.7/dist-packages/rosmaster/master_api.py", line 210, in publisher_update_task
    ret = xmlrpcapi(api).publisherUpdate('/master', topic, pub_uris)
  File "/usr/lib/python2.7/xmlrpclib.py", line 1243, in __call__
    return self.__send(self.__name, args)
  File "/usr/lib/python2.7/xmlrpclib.py", line 1602, in __request
    verbose=self.__verbose
  File "/usr/lib/python2.7/xmlrpclib.py", line 1283, in request
    return self.single_request(host, handler, request_body, verbose)
  File "/usr/lib/python2.7/xmlrpclib.py", line 1311, in single_request
    self.send_content(h, request_body)
  File "/usr/lib/python2.7/xmlrpclib.py", line 1459, in send_content
    connection.endheaders(request_body)
  File "/usr/lib/python2.7/httplib.py", line 1082, in endheaders
    self._send_output(message_body)
  File "/usr/lib/python2.7/httplib.py", line 909, in _send_output
    self.send(msg)
  File "/usr/lib/python2.7/httplib.py", line 871, in send
    self.connect()
  File "/usr/lib/python2.7/httplib.py", line 848, in connect
    self.timeout, self.source_address)
  File "/usr/lib/python2.7/socket.py", line 575, in create_connection
    raise err
error: [Errno 111] Connection refused

[rosmaster.threadpool][ERROR] 2021-06-24 09:06:56,193: Traceback (most recent call last):
  File "/opt/ros/kinetic/lib/python2.7/dist-packages/rosmaster/threadpool.py", line 218, in run
    result = cmd(*args)
  File "/opt/ros/kinetic/lib/python2.7/dist-packages/rosmaster/master_api.py", line 210, in publisher_update_task
    ret = xmlrpcapi(api).publisherUpdate('/master', topic, pub_uris)
  File "/usr/lib/python2.7/xmlrpclib.py", line 1243, in __call__
    return self.__send(self.__name, args)
  File "/usr/lib/python2.7/xmlrpclib.py", line 1602, in __request
    verbose=self.__verbose
  File "/usr/lib/python2.7/xmlrpclib.py", line 1283, in request
    return self.single_request(host, handler, request_body, verbose)
  File "/usr/lib/python2.7/xmlrpclib.py", line 1311, in single_request
    self.send_content(h, request_body)
  File "/usr/lib/python2.7/xmlrpclib.py", line 1459, in send_content
    connection.endheaders(request_body)
  File "/usr/lib/python2.7/httplib.py", line 1082, in endheaders
    self._send_output(message_body)
  File "/usr/lib/python2.7/httplib.py", line 909, in _send_output
    self.send(msg)
  File "/usr/lib/python2.7/httplib.py", line 871, in send
    self.connect()
  File "/usr/lib/python2.7/httplib.py", line 848, in connect
    self.timeout, self.source_address)
  File "/usr/lib/python2.7/socket.py", line 575, in create_connection
    raise err
error: [Errno 111] Connection refused

[rosmaster.master][INFO] 2021-06-24 09:06:56,193: -SERVICE [/cf20/upload_trajectory] /crazyswarm_server rosrpc://localhost:49505
[rosmaster.master][INFO] 2021-06-24 09:06:56,194: -SERVICE [/cf20/start_trajectory] /crazyswarm_server rosrpc://localhost:49505
[rosmaster.master][INFO] 2021-06-24 09:06:56,194: -SERVICE [/cf20/takeoff] /crazyswarm_server rosrpc://localhost:49505
[rosmaster.master][INFO] 2021-06-24 09:06:56,195: -SERVICE [/cf20/land] /crazyswarm_server rosrpc://localhost:49505
[rosmaster.master][INFO] 2021-06-24 09:06:56,195: -SERVICE [/cf20/go_to] /crazyswarm_server rosrpc://localhost:49505
[rosmaster.master][INFO] 2021-06-24 09:06:56,196: -SERVICE [/cf20/set_group_mask] /crazyswarm_server rosrpc://localhost:49505
[rosmaster.master][INFO] 2021-06-24 09:06:56,197: -SERVICE [/cf20/notify_setpoints_stop] /crazyswarm_server rosrpc://localhost:49505
[rosmaster.master][INFO] 2021-06-24 09:06:56,197: -SERVICE [/cf20/update_params] /crazyswarm_server rosrpc://localhost:49505
[rosmaster.master][INFO] 2021-06-24 09:06:56,424: -PUB [/rosout_agg] /rosout http://localhost:34699/
[rosmaster.master][INFO] 2021-06-24 09:06:56,425: -SUB [/rosout] /rosout http://localhost:34699/
[rosmaster.master][INFO] 2021-06-24 09:06:56,425: -SERVICE [/rosout/get_loggers] /rosout rosrpc://localhost:50451
[rosmaster.master][INFO] 2021-06-24 09:06:56,426: -SERVICE [/rosout/set_logger_level] /rosout rosrpc://localhost:50451
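
For anyone reading a long `master.log` like the excerpt above, a small script can summarise it. This is a minimal sketch, not part of ROS or crazyswarm: `summarize_master_log` is a hypothetical helper, the line layout is assumed from the lines pasted here, and it assumes that entries starting with `-SERVICE`, `-PUB` or `-SUB` are unregistrations. It counts unregistrations per node and collects the final `error:` line of each traceback.

```python
import re
from collections import Counter

# Layout assumed from the excerpt in this thread, e.g.
# [rosmaster.master][INFO] 2021-06-24 09:06:56,183: -SERVICE [/cf17/update_params] /crazyswarm_server rosrpc://...
ENTRY = re.compile(
    r"^\[(?P<logger>[^\]]+)\]\[(?P<level>[A-Z]+)\] "
    r"(?P<stamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}): (?P<msg>.*)$"
)
# Assumption: a leading '-' marks an unregistration (a '+' would be a registration).
UNREG = re.compile(r"^-(?:SERVICE|PUB|SUB) \[(?P<name>[^\]]+)\] (?P<node>\S+)")


def summarize_master_log(lines):
    """Return (unregistrations per node, list of logged errors)."""
    unregistered = Counter()
    errors = []
    for raw in lines:
        line = raw.strip()
        entry = ENTRY.match(line)
        if entry is None:
            # Traceback body: remember the final "error: ..." line of the last error.
            if line.startswith("error:") and errors:
                errors[-1]["reason"] = line
            continue
        if entry["level"] == "ERROR":
            errors.append({"stamp": entry["stamp"], "reason": None})
            continue
        unreg = UNREG.match(entry["msg"])
        if unreg:
            unregistered[unreg["node"]] += 1
    return unregistered, errors
```

Calling it as `summarize_master_log(open("master.log"))` returns a `Counter` keyed by node name and a list of dictionaries holding each error's timestamp and reason. That makes it easier to see whether the `Connection refused` errors fall inside a burst of unregistrations from one node.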

3 replies

jpreiss Jul 2, 2021
Maintainer

At what rate do you send cmdPosition? If you send it very fast, maybe it is somehow overloading ROS? Unfortunately I don't know a lot about ROS's internals either. I'm surprised that the ROS master is implemented in Python (if I understand the error message correctly).

swarm5 Jul 2, 2021
Author

In my two accidents, the cmdPosition command was sent at 10 Hz in one case and at 20 Hz in the other.

jpreiss Jul 2, 2021
Maintainer

OK, that is not particularly fast. ROS should be able to handle it.

swarm5 · 2021-07-09T02:13:22Z

swarm5
Jul 9, 2021
Author

Hello, this is the ROS log from another one of my system crashes.
https://drive.google.com/drive/folders/1gdAcruZk4wo-VB7fIseDSNYbymwIoBjS?usp=sharing
I would like to ask:
1. There are a lot of "No updated pose for CF" error messages. What causes them? Is it because the crazyswarm system can't receive point cloud data?
2. cf1-20 all show the "all dynamic check failed for object" error. What causes it? All of my aircraft fly at speeds below 0.5 m/s.

1 reply

whoenig Jul 15, 2021
Maintainer

I can't access the shared link.
1.) In general, for those messages you have to look at the duration at the end. If it just says 'for 0.01 s', this should not be critical (it just means that a single frame was lost).
2.) It should specify which dynamic check failed. This is also only critical with respect to 1.)
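
The duration check in 1.) can be automated over `rosout.log`. This is a minimal sketch under assumptions: the warning wording (`No updated pose for CF <name> for <seconds> s`) is inferred from the fragments quoted in this thread and may differ in your crazyswarm version, and both `long_dropouts` and its 0.5 s default threshold are made up for illustration.

```python
import re

# Assumption: message wording inferred from this thread
# ("No updated pose for CF" ... "for 0.01 s"); adjust if your log differs.
WARNING = re.compile(
    r"No updated pose for CF\s+(?P<cf>\S+)\s+for\s+(?P<secs>\d+(?:\.\d+)?)\s*s\b"
)


def long_dropouts(lines, threshold_s=0.5):
    """Return the longest reported gap per Crazyflie, keeping gaps >= threshold_s."""
    worst = {}
    for line in lines:
        match = WARNING.search(line)
        if match is None:
            continue
        secs = float(match["secs"])
        if secs > worst.get(match["cf"], 0.0):
            worst[match["cf"]] = secs
    # Single lost frames (around 0.01 s) are filtered out by the threshold.
    return {cf: secs for cf, secs in worst.items() if secs >= threshold_s}
```

For example, `long_dropouts(open("rosout.log"))` returns only the Crazyflies whose pose was missing for at least half a second, which separates harmless single-frame losses from the long gaps that matter.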
