Make sure you enclose your code in triple backticks. Use this markup (notice the 3 " ` " enclosing the code block):

````
```
your code goes here
```
````

to render this:

```
your code goes here
```

For example, an error traceback posted as a code block:

```
~/.conda/envs/tf-gpu/lib/python3.6/multiprocessing/popen_fork.py in __init__(self, process_obj)
     18         sys.stderr.flush()
     19         self.returncode = None
---> 20         self._launch(process_obj)
     21
     22     def duplicate_for_child(self, fd):

~/.conda/envs/tf-gpu/lib/python3.6/multiprocessing/popen_fork.py in _launch(self, process_obj)
     65         code = 1
     66         parent_r, child_w = os.pipe()
---> 67         self.pid = os.fork()
     68         if self.pid == 0:
     69             try:

OSError: [Errno 12] Cannot allocate memory
```

🔴 NOTE: Do NOT put your Jupyter Notebook under the /data/ directory! Here's the link for why.
The default location is under the dl1 folder, wherever you've cloned the repo on your GPU machine.

my example:

```
(fastai) paperspace@psgyqmt1m:~$ ls
anaconda3  data  downloads  fastai
```

- Paperspace: `/home/paperspace/fastai/courses/dl1`
- AWS: `/home/ubuntu/fastai/courses/dl1`
If you change the default location of your notebooks, you'll need to update your `.bashrc` file. Add the path to where you've cloned the fastai GitHub repo:

- for me, my notebooks are in a "projects" directory: `~/projects`
- my fastai repo is cloned at the root level, so it is here: `~/fastai`

In the file `.bashrc`, add this path:

```
export PYTHONPATH=$PYTHONPATH:~/fastai
```
Reminder: don't forget to run (or source) your `.bashrc` file:

- add the path where the fastai repo is to `.bashrc`
- save and exit
- source it:

```
source ~/.bashrc
```
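The steps above can be sketched as a shell snippet that is safe to run more than once (it won't duplicate the line); the `~/fastai` location is just my example path:

```shell
# add the fastai clone to PYTHONPATH, but only if the line isn't already there
line='export PYTHONPATH=$PYTHONPATH:~/fastai'
touch ~/.bashrc
grep -qxF "$line" ~/.bashrc || echo "$line" >> ~/.bashrc
# reload so the current shell picks up the change
source ~/.bashrc
```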
Note that if you did a pip install, you don't need to specify the path (as in option 2), and you don't need to put your notebook in the courses folder (as in option 1).
However, fastai is still being actively updated, so there can be a delay before the latest version is available directly via pip.
Can try:

```
pip install https://github.com/fastai/fastai/archive/master.zip
```
my path:

```python
PATH = "/home/ubuntu/data/dogscats/"
```

looking at my directory structure:

```
!tree {PATH} -d
```

```
/home/ubuntu/data/dogscats/
├── models
├── sample
│   ├── models
│   ├── tmp
│   ├── train
│   │   ├── cats
│   │   └── dogs
│   └── valid
│       ├── cats
│       └── dogs
├── test
├── train
│   ├── cats
│   └── dogs
└── valid
    ├── cats
    └── dogs
```

- `models` directory: created automatically
- `sample` directory: you create this with a small sub-sample, for testing code
- `test` directory: put any test data there if you have it
- `train`/`valid` directories: you create these and separate the data using your own data sample
- `tmp` directory: if you have this, it was automatically created after running models
- fastai / keras code automatically picks up the labels of your categories based on your folders. Hence, in this example, the two labels are: dogs, cats
- the folder names are not important, you can name them whatever you want
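Creating the `sample` sub-sample tree can be scripted. A minimal sketch; the function name and defaults here are my own, not fastai code:

```python
import os
import random
import shutil

def make_sample(src_dir, dst_dir, labels=("cats", "dogs"), n=100, seed=42):
    """Copy up to n randomly chosen images per label from src_dir
    into a small sample tree at dst_dir, mirroring the label folders."""
    random.seed(seed)
    for label in labels:
        os.makedirs(os.path.join(dst_dir, label), exist_ok=True)
        files = sorted(os.listdir(os.path.join(src_dir, label)))
        for fname in random.sample(files, min(n, len(files))):
            shutil.copy(os.path.join(src_dir, label, fname),
                        os.path.join(dst_dir, label, fname))
```

You would call it once for `sample/train` and once for `sample/valid`, pointing at the corresponding full folders.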
looking at file counts:

```python
# print number of files in each folder
# (grep ^[^dt] drops the "total" line and directory entries, whose
#  listings start with "t" and "d" respectively)
print("training data: cats")
!ls -l {PATH}train/cats | grep ^[^dt] | wc -l
print("training data: dogs")
!ls -l {PATH}train/dogs | grep ^[^dt] | wc -l
print("validation data: cats")
!ls -l {PATH}valid/cats | grep ^[^dt] | wc -l
print("validation data: dogs")
!ls -l {PATH}valid/dogs | grep ^[^dt] | wc -l
print("test data")
!ls -l {PATH}test1 | grep ^[^dt] | wc -l
```

my output:

```
training data: cats
11501
training data: dogs
11501
validation data: cats
1001
validation data: dogs
1001
test data
12501
```

Suggested split ratios:

- can do 80/20 (train/validation)
- if you have or are creating a 'test' split, use these ratios for (train/validation/test):
  - can do 80/15/5
  - can do 70/20/10
  - can do 60/20/20
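The ratio arithmetic above can be sketched as a small helper (hypothetical, not part of fastai):

```python
def split_counts(n_total, ratios=(0.8, 0.15, 0.05)):
    """Return (train, valid, test) counts for n_total examples,
    giving any rounding remainder to the training set."""
    counts = [round(n_total * r) for r in ratios]
    counts[0] += n_total - sum(counts)
    return tuple(counts)

# e.g. 25,000 dogs-vs-cats training images split 80/15/5
print(split_counts(25000))  # (20000, 3750, 1250)
```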
Note: depending on the instructor, various naming conventions are used:

- train/test, and then validation for the holdout data
- train/validation, and then test for the holdout data

It's important to understand that:

- in the case of train/test, the test set is used to test for generalization
- the holdout data is a second test set, kept aside until the very end
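Whatever the names, a shuffled three-way split can be sketched in plain Python (the helper below is hypothetical, shown here with a 60/20/20 split):

```python
import random

def three_way_split(items, valid_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle items and split them into train/valid/test lists."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n = len(items)
    n_test = int(n * test_frac)
    n_valid = int(n * valid_frac)
    test = items[:n_test]
    valid = items[n_test:n_test + n_valid]
    train = items[n_test + n_valid:]
    return train, valid, test

files = [f"img_{i}.jpg" for i in range(100)]
train, valid, test = three_way_split(files)  # 60/20/20 split
```

Fixing the seed makes the split reproducible, so the same file always lands in the same set across runs.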
Instructions on using the `scp` command to transfer files between platforms:
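For reference, typical `scp` invocations look like this; the hostnames and paths below are placeholders, not real machines:

```shell
# copy a local file up to the remote GPU machine
scp data.zip paperspace@<remote-ip>:/home/paperspace/data/

# copy a remote file back down to the current local directory
scp ubuntu@<remote-ip>:/home/ubuntu/data/results.csv .
```

`scp` uses your SSH credentials, so the same key or password you use to log in to the machine also works here.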
