build.opensuse.org
We deploy using ansible, see ansible-obs for details.
We use Apache in combination with Passenger as the application server.
Passenger has a command-line tool that you can use to get more information:
passenger-status # shows an overview
passenger-status --show=server # shows detailed information
You can also gracefully restart the Passenger threads (they stop once they have processed their last request, and new ones get booted) like this:
touch tmp/restart.txt
The systemd unit for Apache is called apache2.
systemctl status|start|stop|restart apache2
Put Apache into maintenance mode
- Put Apache into maintenance mode in /etc/sysconfig/apache2:
  APACHE_SERVER_FLAGS="STATUS MAINTENANCE"
- Restart Apache:
  rcapache2 restart
- Do whatever you need to do to fix the problem.
- Put Apache out of maintenance mode in /etc/sysconfig/apache2:
  APACHE_SERVER_FLAGS="STATUS"
- Restart Apache:
  rcapache2 restart
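The flag change in the steps above could be scripted roughly as follows. This is only a sketch: `set_apache_flags` is a hypothetical helper name, not part of the OBS tooling, and it assumes GNU `sed` and a single `APACHE_SERVER_FLAGS=` line in the file.

```shell
# Hypothetical helper: set APACHE_SERVER_FLAGS in a sysconfig file.
# Usage: set_apache_flags FILE FLAGS
set_apache_flags() {
  # Replace the whole APACHE_SERVER_FLAGS line and leave all other lines untouched.
  sed -i "s|^APACHE_SERVER_FLAGS=.*|APACHE_SERVER_FLAGS=\"$2\"|" "$1"
}

# Enter maintenance mode, then restart Apache:
#   set_apache_flags /etc/sysconfig/apache2 "STATUS MAINTENANCE" && rcapache2 restart
# Leave maintenance mode again:
#   set_apache_flags /etc/sysconfig/apache2 "STATUS" && rcapache2 restart
```

Keeping the restart outside the helper makes it possible to review the file before Apache picks up the change.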
- Apache: /srv/www/obs/api/log/apache_access.log and /srv/www/obs/api/log/error.log
- Passenger: /var/log/apache2/passenger_log
We run the Rails application as the user wwwrun. Whatever you want to do, do it as this user to avoid creating files or running services with the wrong permissions. To do so, prepend run_in_api to the command, for example:
run_in_api rails console
pry, which we use as the Rails console, has a nice history feature. You can look at the history here:
/var/lib/wwwrun/.local/share/pry/pry_history
You can toggle admin rights for a user with
run_in_api rake user:toggle_admin_rights theusername
- Ruby on Rails: /srv/www/obs/api/log/production.log
- Ruby on Rails calling the backend: /srv/www/obs/api/log/backend_access.log
I, [$TIMESTAMP #$PID] $LOG_LEVEL -- : [$REQUEST_ID] [$PID:$(Time.now - Thread.current[:timestamp_formatter_timestamp])] method=$METHOD path=$PATH format=$FORMAT controller=$CONTROLLER_NAME action=$ACTION_NAME status=$HTTP_STATUS allocations=$ALLOCATIONS¹ duration=$DURATION_MILISECONDS view=$VIEW_MILISECONDS db=$DATABASE_MILISECONDS params=$PARAMS host=$SOURCE_IP backend=$BACKEND_MILISECONDS user=$USER bot=$(voight_kampff result)
¹ https://api.rubyonrails.org/classes/ActiveSupport/Notifications/Event.html#method-i-allocations
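Because most of the line consists of key=value pairs, single fields can be pulled out with plain POSIX awk. The following is a minimal sketch; `log_field` is a hypothetical helper name, and it assumes the value contains no spaces, which may not hold for params=.

```shell
# Hypothetical helper: print the value of one key=value field
# for every log line read from stdin.
# Usage: log_field KEY < LOGFILE
log_field() {
  awk -v key="$1" '{
    for (i = 1; i <= NF; i++) {
      # A field matches when it starts with "KEY=".
      if (index($i, key "=") == 1) {
        print substr($i, length(key) + 2)
        break
      }
    }
  }'
}

# Example: list the duration of every request in the production log.
#   log_field duration < log/production.log
```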
From time to time we have issues with the CSS/JS assets. If application.css or application.js is missing (you will notice unusual errors in your JavaScript console, especially a 404 when trying to retrieve it), then there is probably more than one Sprockets manifest in production. Go to the public folder, check which manifest comes from the package and delete the one that doesn't. After that, reload the application.
cd public/assets
rpm -qf .sprockets-manifest*
rm .sprockets-manifest-$SOMEHASH.json
cd ../..
touch tmp/restart.txt
For delayed jobs, Sphinx, Postfix etc. there are systemd units and a target. You can issue commands for all services...
systemctl stop obs-api-support.target
...or on single units.
systemctl stop obs-sphinx.service
To make sure all units are running fine (displayed as active (running), in green):
systemctl list-dependencies obs-api-support.target
To get an overview of events/jobs, you can run:
run_in_api rails runner script/delayed_job_stats.rb
- searchd: /srv/www/obs/api/log/production.searchd.log and /srv/www/obs/api/log/production.searchd.query.log
- clockworkd: /srv/www/obs/api/log/clockworkd.clock.output
- Postfix: /var/log/mail
- systemd unit logs are available via journalctl -u $SERVICE, for example journalctl -u obs-clockwork.service. You can also filter messages within a time range (either a timestamp or placeholders like "yesterday"). Read up on man journalctl...
journalctl -u $SERVICE --since now|today|yesterday|tomorrow --until "YYYY-MM-DD HH:MM:SS"
We also use a couple of services that are somewhere else in the network.
We push a lot of events and metrics to https://rabbit.opensuse.org/
The HTML frontend of that service shows you live events, so click that link to see if it's working in general.
When you see exceptions like SSL_connect SYSCALL returned=5 errno=0 state=unknown state, this usually means that there is an issue with the RabbitMQ server or the connection to it. The maintenance window of the RabbitMQ server is Thursday, 8:00am to 10:00am CET, which can also cause this issue.
Errbit may also report AMQ::Protocol::EmptyResponseError: Empty response received from the server. This happens not when the reference server is deployed, but when the RabbitMQ machine is (for regular security updates, for example).
We push exceptions to https://errbit.opensuse.org
To make sure that errbit is working, run
run_in_api rake airbrake:test
and it will send an airbrake event to our errbit.
Just use the standard MongoDB text search and CRUD operations.
ssh obs-errbit
cd /srv/www/vhosts/errbit
bin/rails c -e p
finder = Problem.for_apps(App.where(_id: '58c94e2eeb6526000a000000')).search('Event')
finder.each do |problem|
  problem.destroy
end
OBS packages are built whenever a PR gets merged to master. This might delay publishing of built packages. To prevent this, disable the OBS integration in GitHub:
- Go to settings of the OBS GitHub project
- Select the 'Integration & services' tab and click on 'Edit' in the OBS column
- Uncheck the 'Activate' checkbox and 'Update services'
- Once the deployment is done activate the checkbox again ;-)
Of course we deploy on Linux so you can use a lot of the cool tools/features it brings.
We record shell commands with timestamps in /root/.bash_history. You can look at them in chronological order with
history | tr --squeeze-repeats " " | cut -d " " -f 3- | sort
journalctl -u sshd --utc -o json --since "6 hours ago" | ruby /root/who.rb
Grep the top 15 IPs requesting Webui::PackageController#view_file
grep 'controller=Webui::PackageController action=view_file' log/production.log |awk 'match($0,/\yhost=(\S+)/, arr) {print arr[1]}'|sort |uniq -c|sort -n|tail -n 15
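The three-argument form of match() and the \y word boundary used above are GNU awk extensions. On a machine where awk is not gawk, a sketch like the following could produce the same ranking. `top_hosts` is a hypothetical helper name, and the sketch relies on `grep -o`, which GNU and BSD grep both provide.

```shell
# Hypothetical helper: read log lines on stdin and print the N most frequent
# host= values with their counts, most frequent last.
# Usage: top_hosts [N]   (N defaults to 15)
top_hosts() {
  n=${1:-15}
  # Keep only the " host=..." token, strip the key, then count and rank.
  grep -o ' host=[^ ]*' | cut -d= -f2 | sort | uniq -c | sort -n | tail -n "$n"
}

# Example:
#   grep 'controller=Webui::PackageController action=view_file' log/production.log | top_hosts 15
```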
Find the log entries for the first two minutes of the day and sort them by duration:
grep 'T00:00\|T00:01' log/production.log |awk 'match($0,/\yduration=(\S+)/, arr) {print arr[1]}'|sort -n|tail -n 15
Find the requests performed by users who were not logged in and happening on the first minute of the day. Then sort them by duration and display the top 10. Both the duration and controller are displayed:
grep 'T00:00' log/production.log | grep 'user=_nobody_' | awk 'match($0,/controller=(\S+).*duration=(\S+)/, arr) {print arr[2], arr[1]}' | sort -nr | head -n10
Output example:
2067024.96 StatusProjectController
1269978.67 Webui::Packages::BinariesController
Grep the exceptions we saw in the log file (minus the boring ones we've filtered out).
grep -A 1 FATAL log/production.log |grep -v "ActiveRecord::RecordNotFound\|ActionController::RoutingError\|ActionController::InvalidAuthenticityToken\|ActionController::UnknownFormat\|ActiveRecord::RecordNotUnique"|grep -v FATAL|grep -v -- '--'
Grep the request UUID for the full backtrace.
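A minimal sketch of that lookup, assuming every line belonging to the request carries the ID in square brackets as in the log format described earlier. `request_log` is a hypothetical helper name.

```shell
# Hypothetical helper: print every line of LOGFILE tagged with the given request ID.
# Usage: request_log REQUEST_ID LOGFILE
request_log() {
  # -F treats the brackets literally instead of as a regex character class.
  grep -F -- "[$1]" "$2"
}

# Example (replace $REQUEST_ID with the ID taken from the FATAL line):
#   request_log "$REQUEST_ID" log/production.log
```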
For instance when some thread is taking a very long time or starts blocking the database you want to see what is happening now (threads running) instead of seeing what happened in the past (log entries).
passenger-status --show server|grep path
In production, some files can differ from what we expect to have after the package installation. Some of those files are permanently modified, such as configuration files. Others can be temporarily modified, for example when dealing with monkey patches.
It is helpful to verify the packages and discover which of the installed files differ.
rpm -V obs-api
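In rpm -V output, configuration files are marked with a c between the flag column and the path. To concentrate on the other files, such as temporary monkey patches, the output could be filtered along these lines. `non_config_changes` is a hypothetical helper name.

```shell
# Hypothetical helper: read `rpm -V` output on stdin and drop the lines
# whose second column is "c" (files marked as configuration files).
non_config_changes() {
  awk '$2 != "c"'
}

# Example:
#   rpm -V obs-api | non_config_changes
```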
Is something wrong in production? We have collected the most common issues and proposed actions and solutions in Production Squad Tips and Tricks.
The reference server uses a proxy (IDP) to handle users. Sometimes users have problems updating their email. These are the steps:
- Change the email in IDP.
- Verify the email in the account system by entering your username into https://idp-portal.suse.com/univention/self-service/#page=verifyaccount (in a perfect world, this would be triggered automatically by the request to change emails, but it is not yet).
- Wait a few minutes to sync and then log into OBS for it to pick up the update.