Commit f396818

committed
updating READMEs
1 parent e6b1f16 commit f396818

3 files changed

+113
-27
lines changed

README.md

Lines changed: 70 additions & 22 deletions
@@ -32,62 +32,70 @@ After installation, you should be able to run `shub` on the command line, withou
3232

3333
$ shub --help
3434
usage: shub [-h] [--image IMAGE] [--images IMAGES] [--debug]
35-
[--outfolder OUTFOLDER] [--package] [--tree] [--simtree]
36-
[--subtract] [--simcalc] [--size SIZE]
35+
[--outfolder OUTFOLDER] [--package] [--os] [--oscalc] [--tags]
36+
[--tree] [--simtree] [--subtract] [--simcalc] [--size SIZE]
3737

3838
Singularity Hub command line tool
3939

4040
optional arguments:
4141
-h, --help show this help message and exit
4242
--image IMAGE full path to singularity image (for use with --package
4343
and --tree)
44-
--images IMAGES images, separated by commas (for use with --simtree)
44+
--images IMAGES images, separated by commas (for use with --simtree
45+
and --subtract)
4546
--debug use verbose logging to debug.
4647
--outfolder OUTFOLDER
4748
full path to folder for output, stays in tmp (or pwd)
4849
if not specified
4950
--package package a singularity container for singularity hub
51+
--os estimate the operating system of your container.
52+
--oscalc calculate similarity score for your container vs.
53+
docker library OS.
54+
--tags retrieve list of software tags for an image, itself
55+
minus its base
5056
--tree view the guts of a singularity image (use --image)
5157
--simtree view common guts between two images (use --images)
52-
--subtract subtract one container image from the second to make
53-
a difference tree (use --images first,subtract)
58+
--subtract subtract one container image from the second to make a
59+
difference tree (use --images first,subtract)
5460
--simcalc calculate similarity (number) between images based on
5561
file contents.
5662
--size SIZE If using Docker or shub image, you can change size
5763
(default is 1024)
5864

5965

6066

61-
### Package your container
67+
### Classify your container
68+
Singularity Python provides functions for quickly assessing the base operating system of your container, retrieving a list of software tags that remain once this base is subtracted, and getting similarity scores of your container against a library of base operating systems.
6269

63-
A package is a zipped up file that contains the image, the singularity runscript as `runscript`, a `VERSION` file, and a list of files `files.txt` and folders `folders.txt` in the container.
70+
#### Estimate the OS
6471

65-
![img/singularity-package.png](img/singularity-package.png)
72+
You can do this on the command line as follows:
6673

67-
The example package can be [downloaded for inspection](http://www.vbmis.com/bmi/project/singularity/package_image/ubuntu:latest-2016-04-06.img.zip), as can the [image used to create it](http://www.vbmis.com/bmi/project/singularity/package_image/ubuntu:latest-2016-04-06.img). This is one of the drivers underlying [singularity hub](http://www.singularity-hub.org) (under development).
74+
shub --image docker://python:latest --os
75+
[sudo] password for vanessa
76+
Most similar OS found to be debian:7.11
77+
debian:7.11
6878

69-
- **files.txt** and **folders.txt**: are simple text file lists with paths in the container, and this choice is currently done to provide the rawest form of the container contents. These files also are used to generate interactive visualizations, and calculate similarity between containers.
70-
- **VERSION**: is a text file with one line, an md5 hash generated for the image when it was packaged.
71-
- **{{image}}.img**: is of course the original singularity container (usually a .img file)
79+
To do this from within Python, see the [provided example](examples/classify_image/estimate_os.py). From within Python, you can export the sudo password as the environment variable `pancakes` so that it does not need to prompt you. This is not ideal, but it is required for now because we use Singularity to export the image. This will likely change soon.
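A minimal sketch of the Python route, assuming `estimate_os` can be imported from `singularity.analysis.classify` and called with `container=`, as the `shub` script in this commit does. The wrapper name `estimate_container_os` is hypothetical, and whether `estimate_os` accepts a `docker://` URI directly rather than a local image path is an assumption. The linked example file remains the reference.

```python
import os


def estimate_container_os(image, sudopw=None):
    """Hypothetical wrapper: optionally export the sudo password, then estimate the OS."""
    if sudopw is not None:
        # The README names the environment variable "pancakes"
        os.environ["pancakes"] = sudopw
    # Imported lazily so this sketch can be loaded without singularity-python installed
    from singularity.analysis.classify import estimate_os
    return estimate_os(container=image)
```

If you try this, reading the password with `getpass.getpass()` rather than hard-coding it keeps it out of your script.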
7280

73-
First, go to where you have some images:
7481

75-
ls
76-
ubuntu.img
77-
82+
#### Get software tags
83+
Singularity Hub uses a simple algorithm to obtain a likely list of software that is important to your image. It assumes that (most) core installed software is in a folder called `bin`, and returns the list of these files with the estimated base image subtracted. You can do this as follows:
7884

79-
You can now use the `shub` command line tool to package your image. Note that you must have [singularity installed](https://singularityware.lbl.gov/install-linux), and depending on the function you use, you will likely need to use sudo. We can use the `--package` argument to package our image:
8085

81-
shub --image ubuntu.img --package
86+
shub --image docker://python:latest --tags
87+
8288

89+
We also provide an [example for Python](examples/classify_image/derive_tags.py). If you do this programmatically, you can change the folder(s) that are included, meaning that you could get a custom list of software (e.g., libraries in `lib`, or Python packages in `site-packages`).
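The idea behind the tags (files found in folders named `bin`, minus those also present in the estimated base) can be sketched with plain Python sets. This is an illustration of the description above, not the library's implementation: the function name `derive_tags`, its parameters, and the sample file lists are all made up for the example.

```python
import posixpath


def derive_tags(container_files, base_files, folders=("bin",)):
    """Names of files directly inside one of `folders` in the container but not in the base."""
    def names_in(paths):
        names = set()
        for path in paths:
            parent, name = posixpath.split(path.rstrip("/"))
            # Keep the file only if its immediate parent folder is one we care about
            if name and posixpath.basename(parent) in folders:
                names.add(name)
        return names

    return sorted(names_in(container_files) - names_in(base_files))


# Hypothetical file lists for a container and its estimated base image
container = ["/bin/ls", "/usr/bin/python3", "/usr/local/bin/pip", "/etc/hosts"]
base = ["/bin/ls", "/usr/bin/env", "/etc/hosts"]
print(derive_tags(container, base))  # ['pip', 'python3']
```

Passing a different `folders` tuple, for example `("site-packages",)`, corresponds to the custom lists mentioned above.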
8390

84-
If no output folder is specified, the resulting image (named in the format `ubuntu.img.zip` will be output in the present working directory. You can also specify an output folder:
8591

86-
shub --image ubuntu.img --package --outfolder /tmp
92+
#### Compare to base OS
93+
To get a complete list of similarity scores for your image against a core set of the latest [docker-os](singularity/analysis/packages/docker-os) images:
8794

88-
For the package command, you will need to put in your password to grant sudo priviledges, as packaging requires using the singularity `export` functionality.
95+
shub --image docker://python:latest --oscalc
8996

90-
For more details, and a walkthrough with sample data, please see [examples/package_image](examples/package_image)
97+
98+
Again, see [this example](examples/classify_image/estimate_os.py) for doing this from within Python.
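To make the idea of scoring a container against a library of base images concrete, here is a small self-contained sketch. It assumes each image is represented by its list of file paths and uses the Jaccard index as the similarity measure. That metric is chosen purely for illustration; the actual calculation in Singularity Python may differ, and all names and file lists below are hypothetical.

```python
def jaccard(a, b):
    """Jaccard index of two collections: size of intersection over size of union."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def score_against_bases(container_files, base_library):
    """Map each base image name to its similarity with the container."""
    return {name: jaccard(container_files, files) for name, files in base_library.items()}


def most_similar(scores):
    """Name of the base image with the highest score."""
    return max(scores, key=scores.get)


# Hypothetical, heavily abbreviated file lists
library = {
    "debian:7.11": ["/bin/ls", "/etc/debian_version", "/usr/bin/dpkg"],
    "centos:7": ["/bin/ls", "/etc/redhat-release", "/usr/bin/rpm"],
}
container = ["/bin/ls", "/etc/debian_version", "/usr/bin/dpkg", "/usr/bin/python"]

scores = score_against_bases(container, library)
print(most_similar(scores))
print(scores)
```

The full dictionary of scores plays the role of the `--oscalc` output, while picking the single best match corresponds to `--os`.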
9199

92100

93101
### View the inside of a container
@@ -114,6 +122,14 @@ An [interactive demo](https://singularityware.github.io/singularity-python/examp
114122

115123
### Visualize Containers
116124

125+
#### Container Similarity Clustering
126+
Do you have sets of containers or packages, and want to cluster them based on similarities?
127+
128+
![examples/package_tree/docker-os.png](examples/package_tree/docker-os.png)
129+
130+
We have examples for both deriving scores and producing plots like the one above; see [examples/package_tree/docker-os.png](examples/package_tree/docker-os.png)
131+
132+
117133
#### Container Similarity Tree
118134

119135
![examples/similar_tree/simtree.png](examples/similar_tree/simtree.png)
@@ -185,6 +201,38 @@ and the same applies for specification of Docker images, as in the previous exam
185201

186202

187203

204+
### Package your container
205+
The driver of much of the above is the simple container package. A package is a zipped up file that contains the image, the singularity runscript as `runscript`, a `VERSION` file, and a list of files `files.txt` and folders `folders.txt` in the container.
206+
207+
![img/singularity-package.png](img/singularity-package.png)
208+
209+
The example package can be [downloaded for inspection](http://www.vbmis.com/bmi/project/singularity/package_image/ubuntu:latest-2016-04-06.img.zip), as can the [image used to create it](http://www.vbmis.com/bmi/project/singularity/package_image/ubuntu:latest-2016-04-06.img). This is one of the drivers underlying [singularity hub](http://www.singularity-hub.org) (under development).
210+
211+
- **files.txt** and **folders.txt**: simple text files listing paths in the container. This format was chosen to provide the rawest form of the container contents. These files are also used to generate interactive visualizations and to calculate similarity between containers.
212+
- **VERSION**: is a text file with one line, an md5 hash generated for the image when it was packaged.
213+
- **{{image}}.img**: is of course the original singularity container (usually a .img file)
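The layout described above can be sketched with the standard library alone. This is only an illustration of the package structure, not how `shub --package` builds it (the real tool exports the file lists from the image with Singularity); the function name `build_package` and the sample inputs are hypothetical.

```python
import hashlib
import io
import zipfile


def build_package(image_name, image_bytes, runscript, files, folders):
    """Return the bytes of a zip archive laid out like the package described above."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(image_name, image_bytes)
        archive.writestr("runscript", runscript)
        # VERSION holds one line: the md5 hash of the image
        archive.writestr("VERSION", hashlib.md5(image_bytes).hexdigest())
        archive.writestr("files.txt", "\n".join(files))
        archive.writestr("folders.txt", "\n".join(folders))
    return buffer.getvalue()


# Stand-in content; a real image would be read from disk
package = build_package("ubuntu.img", b"fake image bytes", "exec /bin/bash",
                        ["/bin/ls"], ["/bin"])
with zipfile.ZipFile(io.BytesIO(package)) as archive:
    print(sorted(archive.namelist()))
```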
214+
215+
First, go to where you have some images:
216+
217+
ls
218+
ubuntu.img
219+
220+
221+
You can now use the `shub` command line tool to package your image. Note that you must have [singularity installed](https://singularityware.lbl.gov/install-linux), and depending on the function you use, you will likely need to use sudo. We can use the `--package` argument to package our image:
222+
223+
shub --image ubuntu.img --package
224+
225+
226+
If no output folder is specified, the resulting image (named in the format `ubuntu.img.zip`) will be output in the present working directory. You can also specify an output folder:
227+
228+
shub --image ubuntu.img --package --outfolder /tmp
229+
230+
For the package command, you will need to put in your password to grant sudo privileges, as packaging requires using the singularity `export` functionality.
231+
232+
For more details, and a walkthrough with sample data, please see [examples/package_image](examples/package_image)
233+
234+
235+
188236
### Build your container
189237
More information coming soon.
190238

examples/package_tree/README.md

Lines changed: 10 additions & 0 deletions
@@ -1,6 +1,16 @@
11
# How similar are my operating systems?
22
A question that has spun out of one of my projects that I suspect would be useful in many applications but hasn't been fully explored is comparison of operating systems. If you think about it, for the last few decades we've generated many methods for comparing differences between files. We have md5 sums to make sure our downloads didn't poop out, and command line tools to quickly look for differences. We now have to take this up a level, because our new level of operation isn't on a single "file", it's on an entire operating system. It's not just your Mom's computer, it's a container-based thing (e.g., Docker or Singularity) that contains a base OS plus additional libraries and packages and then the special sauce, the application or analysis that the container was birthed into existence to carry out. It's not good enough to have message storage places to dump these containers, we need simple and consistent methods to computationally compare them, organize them, and let us explore them.
33

4+
We have provided this simple method in Singularity Python, which can produce plots like the following:
5+
6+
## Cluster Docker (Library) Images based on Base OS
7+
![docker-os.library](docker-os.png)
8+
9+
## Cluster Base OS Versions
10+
![docker-os.png](docker-os.png)
11+
12+
The derivation of the scores can be seen in [calculate_similarity.py](calculate_similarity.py), and the simple plot in [plot_similarity.py](plot_similarity.py).
13+
414

515
# Similarity of File Paths
616
When I think about it, an entire understanding of an "image" (or more generally, a computer or operating system) comes down to the programs installed, and files included. Yes, there might be various environmental variables, but I would hypothesize that the environmental variables found in an image have a rather strong correlation with the software installed, and we would do pretty well to understand the guts of an image from the body without the electricity flowing through it. This would need to be tested, but not quite yet.

singularity/scripts.py

Lines changed: 33 additions & 5 deletions
@@ -40,6 +40,21 @@ def get_parser():
4040
help="package a singularity container for singularity hub",
4141
default=False, action='store_true')
4242

43+
# Does the user want to estimate the os?
44+
parser.add_argument('--os', dest="os",
45+
help="estimate the operating system of your container.",
46+
default=False, action='store_true')
47+
48+
# Does the user want to estimate the os?
49+
parser.add_argument('--oscalc', dest="oscalc",
50+
help="calculate similarity score for your container vs. docker library OS.",
51+
default=False, action='store_true')
52+
53+
# Does the user want to get tags for an image?
54+
parser.add_argument('--tags', dest="tags",
55+
help="retrieve list of software tags for an image, itself minus its base",
56+
default=False, action='store_true')
57+
4358
# View the guts of a Singularity image
4459
parser.add_argument('--tree', dest='tree',
4560
help="view the guts of a singularity image (use --image)",
@@ -101,11 +116,6 @@ def main():
101116
# If we are given an image, ensure full path
102117
if args.image != None:
103118

104-
# Exit if the image cannot be found
105-
if os.path.exists(args.image) == False:
106-
print("Cannot find image. Exiting.")
107-
sys.exit(1)
108-
109119
image,existed = get_image(args.image,
110120
return_existed=True,
111121
size=args.size)
@@ -120,6 +130,24 @@ def main():
120130
make_tree(image)
121131
clean_up(image,existed)
122132

133+
# The user wants to estimate the os
134+
elif args.os == True:
135+
from singularity.analysis.classify import estimate_os
136+
estimated_os = estimate_os(container=image)
137+
print(estimated_os)
138+
139+
# The user wants to get a list of all os
140+
elif args.oscalc == True:
141+
from singularity.analysis.classify import estimate_os
142+
estimated_os = estimate_os(container=image,return_top=False)
143+
print(estimated_os["SCORE"].to_dict())
144+
145+
# The user wants to get a list of tags
146+
elif args.tags == True:
147+
from singularity.analysis.classify import get_tags
148+
tags = get_tags(container=image)
149+
print(tags)
150+
123151

124152
# The user wants to package the image
125153
elif args.package == True:
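The flag handling added in this commit can be tried in isolation with a stripped-down parser. The sketch below keeps only `--image` and the three new flags and returns the name of the chosen action instead of calling into `singularity.analysis.classify`; `choose_action` is a hypothetical helper, not part of `singularity/scripts.py`.

```python
import argparse


def get_parser():
    """Reduced parser with only --image and the three flags added in this commit."""
    parser = argparse.ArgumentParser(description="sketch of the new shub flags")
    parser.add_argument("--image", dest="image", default=None,
                        help="full path to singularity image")
    new_flags = [
        ("os", "estimate the operating system of your container."),
        ("oscalc", "calculate similarity score for your container vs. docker library OS."),
        ("tags", "retrieve list of software tags for an image, itself minus its base"),
    ]
    for name, text in new_flags:
        parser.add_argument("--" + name, dest=name, help=text,
                            default=False, action="store_true")
    return parser


def choose_action(args):
    """Mirror the order of the elif chain: os, then oscalc, then tags."""
    if args.os:
        return "os"
    if args.oscalc:
        return "oscalc"
    if args.tags:
        return "tags"
    return None


args = get_parser().parse_args(["--image", "docker://python:latest", "--tags"])
print(choose_action(args))  # tags
```

Because the branches are checked in order, passing several of these flags at once runs only the first match.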
