Tools relating to the Xylem Global Innovation Challenge on Urban Flood Prediction
In this Challenge, we aim to use predictive modelling to help Portland, Oregon residents anticipate floods and take pre-emptive action against them.
You can access the Floodopedia live app here.
This is an open-source project and we strive to continually improve the functionality of Floodopedia. Feel free to make a pull request or raise issues!
- Create a Python virtual environment (venv) by running the following on your Command Prompt: `C:\>python -m venv C:\path\to\myenv`
- On your shell, run `pip install -r requirements.txt`
- [OPTIONAL] Download, install and run ngrok here if you want to make your locally run Flask server accessible on the Internet.
To run Floodopedia locally:
- On your shell, run `python app.py`
- Go to `localhost:5001` on your web browser. Alternatively, you can specify a different IP or port number in the `app.run(host='new-IP-address', port=<new-port-number>)` method in `app.py` and visit `new-IP-address:<new-port-number>` instead (see the sketch after this list).
- Press `Ctrl+C` to terminate the server.
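For reference, the tail of `app.py` would look something like this minimal sketch (the `host='0.0.0.0'` value is an assumption; only port 5001 appears in the steps above):

```python
from flask import Flask

app = Flask(__name__)  # the real app.py defines its routes above this point

if __name__ == '__main__':
    # host='0.0.0.0' (an assumed value) listens on all interfaces;
    # port=5001 matches the localhost:5001 address in the steps above.
    app.run(host='0.0.0.0', port=5001)
```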
To run Floodopedia over the Internet with ngrok:
- On your shell, run `python app.py`
- Open the ngrok shell and run `ngrok http 5001` or `ngrok http <your-port-number>`
- You should see a temporary link on your shell; you can access Floodopedia via the displayed link.
- Press `Ctrl+C` to terminate the server.
- The features used by the decision classifier are Gage Height, Turbidity and Discharge.
- As flooding is an extreme and rare event, the available USGS data had weak correlations (<0.20) with flooding; Gage Height, Turbidity and Discharge had the strongest correlations.
- If you wish to populate the dataset with newer data, you can pull raw values from the USGS Fanno Creek website.
- Modify `flood_prediction_model.py` to your liking and serialize your modified model as a pickle file by running `pickle.dump(<model-variable-name>, open('<your-filename>.pkl', 'wb'))`
- In `app.py`, de-serialize `<your-filename>.pkl` with `pickle.load(open('<your-filename>.pkl', 'rb'))`; you can then run your model's `.predict()` method on a dataset. A sketch of this workflow follows below.
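A minimal sketch of the retraining workflow, assuming a scikit-learn classifier, a `flood_data.csv` file and the column names shown (all hypothetical; the repository's actual model and dataset may differ):

```python
# flood_prediction_model.py-style sketch (hypothetical names throughout).
import pickle

import pandas as pd
from sklearn.tree import DecisionTreeClassifier

df = pd.read_csv('flood_data.csv')  # hypothetical dataset file
features = ['gage_height', 'turbidity', 'discharge']

# Sanity check: correlation of each feature with the flood label,
# as in the (<0.20) observation above.
print(df[features + ['flood']].corr()['flood'])

# Train and serialize, mirroring the pickle.dump(...) call above.
model = DecisionTreeClassifier().fit(df[features], df['flood'])
pickle.dump(model, open('my_flood_model.pkl', 'wb'))

# In app.py: de-serialize and predict on a fresh reading.
model = pickle.load(open('my_flood_model.pkl', 'rb'))
print(model.predict(pd.DataFrame([[15.7, 12.0, 54.0]], columns=features)))
```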
- Floodopedia runs on Python's `Flask` library and uses a REST API to make requests and responses between and within webpages.
- Floodopedia is designed so that each refresh fetches new data (if any) from the USGS Fanno Creek website.
- BeautifulSoup4 is used to scrape HTML text data from the USGS Fanno Creek site. Floodopedia's design deliberately omits a webdriver, which avoids dependency issues across different machines; no external installation is needed.
- The variable `formatted_description` holds data in the form 'Most recent instantaneous value: 15.7 05-28-2021 02:00 PDT'
- Regular expressions are used to format the scraped data: `re.findall(r'[\d\.\d]+', formatted_description)[0]` returns the raw value (e.g. 15.7), while `(re.findall(r'((0[1-9]|1[0-2])\-(0[1-9]|1\d|2\d|3[01])\-(19|20)\d{2}\s\s([0-1]?[0-9]|2[0-3]):[0-5][0-9]\s([P][D][T])\s)$', formatted_description))[0][0]` returns the date, time and timezone (e.g. 05-28-2021 02:00 PDT). A sketch of this pipeline follows below.
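Putting the pieces together, here is a minimal sketch of the scrape-on-refresh pipeline: a Flask route that fetches the page with BeautifulSoup4 and parses the reading with the regular expressions quoted above. The URL, the `/latest` endpoint and the page lookup are illustrative assumptions, not the repository's actual selectors:

```python
# Sketch of scrape-on-refresh (URL, route and page lookup are assumptions).
import re

import requests
from bs4 import BeautifulSoup
from flask import Flask, jsonify

app = Flask(__name__)

USGS_URL = 'https://waterdata.usgs.gov/nwis/uv?site_no=XXXXXXXX'  # placeholder site number

DATE_TIME_RE = (r'((0[1-9]|1[0-2])\-(0[1-9]|1\d|2\d|3[01])\-(19|20)\d{2}\s\s'
                r'([0-1]?[0-9]|2[0-3]):[0-5][0-9]\s([P][D][T])\s)$')

def parse_reading(formatted_description):
    # The first run of digits/dots is the raw value, e.g. '15.7'.
    value = re.findall(r'[\d\.\d]+', formatted_description)[0]
    # Note: the date/time regex expects a double space before the time and a
    # trailing space after 'PDT' (\s\s ... \s$).
    timestamp = re.findall(DATE_TIME_RE, formatted_description)[0][0]
    return value, timestamp.strip()

@app.route('/latest')  # hypothetical endpoint
def latest():
    # Each request re-fetches the page, so a browser refresh sees new data.
    html = requests.get(USGS_URL, timeout=10).text
    soup = BeautifulSoup(html, 'html.parser')
    # Hypothetical lookup; the real element holding the description differs.
    formatted_description = soup.get_text()
    value, ts = parse_reading(formatted_description)
    return jsonify(value=value, timestamp=ts)

# Offline check with a string shaped the way the regexes expect:
print(parse_reading('Most recent instantaneous value: 15.7 05-28-2021  02:00 PDT '))
```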