| 1 | +--- |
| 2 | +title: TCIE 2.x Platform Administration Tips |
| 3 | +layout: en_enterprise |
| 4 | + |
| 5 | +--- |
| 6 | + |
| 7 | +This section collects FAQs and day-to-day Travis CI Enterprise (TCIE) Platform maintenance scripts and tools. |
| 8 | + |
| 9 | +Please connect to your Platform machine via SSH before getting started. |
| 10 | + |
| 11 | +## Platform Logs in TCIE 2.x |
| 12 | + |
| 13 | +On the Platform, you can find the main log file at |
| 14 | +`/var/travis/log/travis.log`. It is also symlinked to |
| 15 | +`/var/log/travis.log` for convenience. |
| 16 | + |
| 17 | +## Container and Console access in TCIE 2.x |
| 18 | + |
| 19 | +`travis bash`: This will get you into the running container on the |
| 20 | +Platform. |
| 21 | + |
| 22 | +`travis console`: This will get you into a Ruby IRB session on the |
| 23 | +Platform. |
| 24 | + |
| 25 | +## Cancel or Reset Jobs |
| 26 | + |
| 27 | +Occasionally, jobs can get stuck in a `queued` state on the worker. To cancel or |
| 28 | +reset a large number of jobs, please execute the following steps: |
| 29 | + |
| 30 | +**TCIE 2.x**: `$ travis console` *on Platform host* |
| 31 | + |
| 32 | +Then, run: |
| 33 | + |
| 34 | +``` |
| 35 | +>> stuck_jobs = Job.where(queue: 'builds.linux', state: 'queued').where('queued_at < NOW() - interval \'60 minutes\'').all |
| 36 | +>> # Cancels all stuck jobs |
| 37 | +>> stuck_jobs.each(&:cancel!) |
| 38 | +>> # Or reset them |
| 39 | +>> stuck_jobs.each(&:reset!) |
| 40 | +``` |
| 41 | + |
| 42 | +## Clear Redis Archive Queue (V2.1.7 and prior) |
| 43 | + |
| 44 | +In Enterprise releases before 2.1.7, jobs were enqueued in the `archive` queue |
| 45 | +for log aggregation. However, this feature is currently available only on the |
| 46 | +hosted versions of Travis CI. |
| 47 | + |
| 48 | +On Enterprise, the queue therefore keeps growing without being worked off. |
| 49 | +As a result, Redis' memory consumption increases over time, which can |
| 50 | +degrade the performance of the whole platform. Clearing the `archive` queue frees these resources. |
| 51 | + |
| 52 | +To clear it, please execute the following command: |
| 53 | + |
| 54 | +**TCIE 2.x**: `$ travis console` *on Platform host* |
| 55 | + |
| 56 | +Then, run: |
| 57 | + |
| 58 | +``` |
| 59 | +>> require 'sidekiq/api' |
| 60 | +>> Sidekiq::Queue.new('archive').clear |
| 61 | +``` |
| 62 | + |
| 63 | +## Reset the RabbitMQ Certificate in TCIE 2.x |
| 64 | + |
| 65 | +After upgrading Replicated from 2.8.0 to a newer version, the service |
| 66 | +occasionally restarts with the following error: |
| 67 | + |
| 68 | +``` |
| 69 | +$ docker inspect --format '{% raw %}{{.State.Error}}{% endraw %}' focused_yalow |
| 70 | +oci runtime error: container_linux.go:247: starting container process |
| 71 | +caused "process_linux.go:359: container init caused |
| 72 | +\"rootfs_linux.go:54: mounting |
| 73 | +\\\"/var/lib/replicated-operator/44c648980d1e4b1c5a97167046f32f11/etc/travis/ssl/rabbitmq.cert\\\" |
| 74 | +to rootfs |
| 75 | +\\\"/var/lib/docker/aufs/mnt/a00833d25e72b761e2a0e72b1015dd2b2f3a32cafd2851ba408b298f73b37d37\\\" |
| 76 | +at |
| 77 | +\\\"/var/lib/docker/aufs/mnt/a00833d25e72b761e2a0e72b1015dd2b2f3a32cafd2851ba408b298f73b37d37/etc/travis/ssl/rabbitmq.cert\\\" |
| 78 | +caused \\\"not a directory\\\"\"" |
| 79 | +: Are you trying to mount a directory onto a file (or vice-versa)? Check |
| 80 | +if the specified host path exists and is the expected type. |
| 81 | +``` |
| 82 | + |
| 83 | +To address this, remove the RabbitMQ cert from `/etc/travis/ssl/`: |
| 84 | + |
| 85 | +``` |
| 86 | +$ sudo rm -r /etc/travis/ssl/rabbitmq.cert |
| 87 | +``` |
| 88 | +After this, do a full reboot of the system, and everything should start properly again. |
| 89 | + |
| 90 | +## View Sidekiq Queue Statistics |
| 91 | + |
| 92 | +There have been reported cases where the system became unresponsive: jobs took a long time to be worked off or were not picked up at all. Full Sidekiq queues often played a part in this. To get some insight, retrieve some basic statistics in the Ruby console: |
| 93 | + |
| 94 | +**TCIE 2.x**: `$ travis console` *on Platform host* |
| 95 | + |
| 96 | +Then, run: |
| 97 | + |
| 98 | +``` |
| 99 | + >> require 'sidekiq/api' |
| 100 | + => true |
| 101 | + >> stats = Sidekiq::Stats.new |
| 102 | + >> stats.queues |
| 103 | + => {"sync.low"=>315316, |
| 104 | + "archive"=>7900, |
| 105 | + "repo_sync"=>193, |
| 106 | + "webhook"=>0, |
| 107 | + "keen_events"=>0, |
| 108 | + "scheduler"=>0, |
| 109 | + "github_status"=>0, |
| 110 | + "build_requests"=>0, |
| 111 | + "build_restarts"=>0, |
| 112 | + "hub"=>0, |
| 113 | + "slack"=>0, |
| 114 | + "pusher"=>0, |
| 115 | + "pusher-live"=>0, |
| 116 | + "build_cancellations"=>0, |
| 117 | + "sync"=>0, |
| 118 | + "user_sync"=>0} |
| 119 | +``` |
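
To spot which queues are backed up, the hash returned by `stats.queues` can be filtered and sorted. The following is a minimal sketch in plain Ruby, using a few sample values from the output above as a literal hash so that it runs without Sidekiq. The `backlog` helper and its `threshold` parameter are hypothetical names chosen for illustration; they are not part of Sidekiq or TCIE.

```ruby
# Sample data copied from the output above; on a live system, use
# Sidekiq::Stats.new.queues instead of this literal hash.
queues = {
  "sync.low" => 315316,
  "archive" => 7900,
  "repo_sync" => 193,
  "webhook" => 0,
  "hub" => 0
}

# Hypothetical helper: keep queues above a threshold, largest first.
def backlog(queues, threshold: 0)
  queues.select { |_name, size| size > threshold }
        .sort_by { |_name, size| -size }
        .to_h
end

# Print one line per non-empty queue.
backlog(queues).each do |name, size|
  puts format("%-12s %d", name, size)
end
```

Raising `threshold` (for example, `backlog(queues, threshold: 1000)`) narrows the list to the queues most likely to need attention.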
| 120 | + |
| 121 | +## Uninstall Travis CI Enterprise 2.x |
| 122 | + |
| 123 | +If you wish to uninstall Travis CI Enterprise 2.x from your platform and worker |
| 124 | +machines, please follow the instructions below. You |
| 125 | +need to run the following commands on the platform machine in order. <small>(Instructions copied over |
| 126 | +from <a href="https://help.replicated.com/docs/native/customer-installations/installing-via-script/">Replicated</a>)</small> |
| 127 | + |
| 128 | +### With Ubuntu 16.04 as host operating system |
| 129 | + |
| 130 | +```sh |
| 131 | +sudo systemctl stop replicated |
| 132 | +sudo systemctl stop replicated-ui |
| 133 | +sudo systemctl stop replicated-operator |
| 134 | +sudo docker ps | grep "replicated" | awk '{print $1}' | xargs sudo docker stop |
| 135 | +sudo docker ps | grep "quay.io-travisci-te-main" | awk '{print $1}' | xargs sudo docker stop |
| 136 | +sudo docker rm -f replicated replicated-ui replicated-operator replicated-premkit replicated-statsd |
| 137 | +sudo docker images | grep "replicated" | awk '{print $3}' | xargs sudo docker rmi -f |
| 138 | +sudo docker images | grep "te-main" | awk '{print $3}' | xargs sudo docker rmi -f |
| 139 | +sudo rm -rf /var/lib/replicated* /etc/replicated* /etc/init/replicated* /etc/init.d/replicated* /etc/default/replicated* /var/log/upstart/replicated* /etc/systemd/system/replicated* |
| 140 | +``` |
| 141 | + |
| 142 | +On the worker machine, you need to run this command to remove travis-worker and all build images: |
| 143 | + |
| 144 | +```sh |
| 145 | +$ sudo docker images | grep travis | awk '{print $3}' | xargs sudo docker rmi -f |
| 146 | +``` |
| 147 | + |
| 148 | +### With Ubuntu 14.04 as host operating system |
| 149 | + |
| 150 | +```sh |
| 151 | +sudo service replicated stop |
| 152 | +sudo service replicated-ui stop |
| 153 | +sudo service replicated-operator stop |
| 154 | +sudo docker stop replicated-premkit |
| 155 | +sudo docker stop replicated-statsd |
| 156 | +sudo docker rm -f replicated replicated-ui replicated-operator replicated-premkit replicated-statsd |
| 157 | +sudo docker images | grep "quay\.io/replicated" | awk '{print $3}' | xargs sudo docker rmi -f |
| 158 | +sudo apt-get remove -y replicated replicated-ui replicated-operator |
| 159 | +sudo apt-get purge -y replicated replicated-ui replicated-operator |
| 160 | +sudo rm -rf /var/lib/replicated* /etc/replicated* /etc/init/replicated* /etc/init.d/replicated* /etc/default/replicated* /var/log/upstart/replicated* /etc/systemd/system/replicated* |
| 161 | +``` |
| 162 | + |
| 163 | +On the worker machine, you need to run this command to remove travis-worker: |
| 164 | + |
| 165 | +``` |
| 166 | +$ sudo apt-get autoremove travis-worker |
| 167 | +``` |
| 168 | + |
| 169 | +Additionally, please use the following command to clean up all Docker build images: |
| 170 | + |
| 171 | +``` |
| 172 | +$ sudo docker images | grep travis | awk '{print $3}' | xargs sudo docker rmi -f |
| 173 | +``` |
| 174 | + |
| 175 | +## Discover the Maximum Available Concurrency |
| 176 | + |
| 177 | +To find out how much concurrency is available in your Travis CI Enterprise setup: |
| 178 | + |
| 179 | +**TCIE 2.x**: connect to your platform machine via SSH and run `$ travis bash` |
| 180 | + |
| 181 | +Then, please run: |
| 182 | + |
| 183 | +``` |
| 184 | +root@te-main:/# rabbitmqctl list_consumers -p travis | grep builds.trusty | wc -l |
| 185 | +``` |
| 186 | + |
| 187 | +The number that's returned here is equal to the maximum number of concurrent jobs that are available. To adjust concurrency, please follow the instructions [here](/user/enterprise/worker-configuration/#configuring-the-number-of-concurrent-jobs) for each worker machine. |
| 188 | + |
| 189 | +## Discover how many Worker Machines are Connected |
| 190 | + |
| 191 | +If you wish to find out how many worker machines are currently connected, please follow these steps: |
| 192 | + |
| 193 | +**TCIE 2.x**: connect to your platform machine via SSH and run: `$ travis bash` |
| 194 | + |
| 195 | +Then, run: |
| 196 | + |
| 197 | +``` |
| 198 | +root@te-main:/# rabbitmqctl list_consumers -p travis | grep amq.gen- | wc -l |
| 199 | +``` |
| 200 | + |
| 201 | +If you need to boot more worker machines, please see our docs about [installing new worker machines](/user/enterprise/setting-up-travis-ci-enterprise/#2-setting-up-the-enterprise-worker-virtual-machine). |
| 202 | + |
| 203 | +## Create Data Directories backup in TCIE 2.x |
| 204 | +The data directories are located on the platform machine and are mounted into the Travis CI container. In these directories, you'll find files from RabbitMQ, Postgres, Slanger, Redis, and log files from the various applications inside the container. |
| 205 | + |
| 206 | +The files are located at `/var/travis` on the platform machine. Please run `sudo tar -czvf travis-enterprise-data-backup.tar.gz /var/travis` to create a compressed archive from this folder. After this has finished, copy this file off the machine to a secure location. |
| 207 | + |
| 208 | +## Migrate from GitHub Services to Webhooks |
| 209 | + |
| 210 | +Travis CI Enterprise initially used GitHub Services to connect your repositories with GitHub.com (or GitHub Enterprise). As of January 31st, 2019, [services have been disabled on github.com](https://developer.github.com/changes/2019-01-29-life-after-github-services/). Services will also be disabled on GitHub Enterprise starting with GitHub Enterprise v2.17.0. |
| 211 | + |
| 212 | +Starting with [Travis CI Enterprise v2.2.5](https://enterprise-changelog.travis-ci.com/release-2-2-5-77988), all repositories that are activated use [webhooks](https://developer.github.com/webhooks/) to connect and manage communication with GitHub.com/GitHub Enterprise. |
| 213 | + |
| 214 | +> Repositories activated before Travis CI Enterprise v2.2.5 may need to be updated. |
| 215 | + |
| 216 | +To perform an automatic migration, please follow these steps: |
| 217 | + |
| 218 | +1. **TCIE 2.x only**: Open an SSH connection to the platform machine. |
| 219 | +2. Run the following command: |
| 220 | +``` |
| 221 | +travis bash -c ". /etc/profile; cd /usr/local/travis-api && ENV=production bundle exec ./bin/migrate-hooks <optional-year>" |
| 222 | +``` |
| 223 | +This will search for all active repositories still using GitHub Services and migrate them to webhooks instead. |
| 224 | + |
| 225 | +You can provide a year argument (e.g., `2017`) in the above command to only migrate repositories activated on Travis CI Enterprise during that year. |
| 226 | + |
| 227 | +If you have a large number of repositories activated on your Travis CI Enterprise installation, please run the migration several times, breaking it down per year. For example: |
| 228 | + |
| 229 | +``` |
| 230 | +travis bash -c ". /etc/profile; cd /usr/local/travis-api && ENV=production bundle exec ./bin/migrate-hooks 2019" |
| 231 | +travis bash -c ". /etc/profile; cd /usr/local/travis-api && ENV=production bundle exec ./bin/migrate-hooks 2018" |
| 232 | +travis bash -c ". /etc/profile; cd /usr/local/travis-api && ENV=production bundle exec ./bin/migrate-hooks 2017" |
| 233 | +``` |
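
The per-year commands above follow one pattern, so they can be generated rather than typed out. The following is a minimal sketch in Ruby that only prints the command strings and does not run them. `migrate_command` is a hypothetical helper name, and the year range is an assumption to adapt to your installation.

```ruby
# Hypothetical helper: builds the migration command for one year,
# using the exact command line shown above.
def migrate_command(year)
  %(travis bash -c ". /etc/profile; cd /usr/local/travis-api && ENV=production bundle exec ./bin/migrate-hooks #{year}")
end

# Newest year first, matching the example above.
2019.downto(2017) { |year| puts migrate_command(year) }
```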
| 234 | + |
| 235 | +## Contact Enterprise Support |
| 236 | + |
| 237 | +{{ site.data.snippets.contact_enterprise_support }} |