Commit 2fcc28c

DOC - working on data package usecases
1 parent bb12721 commit 2fcc28c

2 files changed: +104 -88 lines changed

doc/source/devel/data_pkg_discuss.rst

Lines changed: 20 additions & 10 deletions
@@ -1,8 +1,8 @@
 .. _data-package-discuss:
 
-#############################
-Principles of data package NG
-#############################
+##########################
+Principles of data package
+##########################
 
 **********
 Motivation
@@ -36,7 +36,7 @@ separable.
 Package
 =======
 
-This ideas is rather difficult to define, but is a bit like a data project, that
+This idea is rather difficult to define, but is a bit like a data project, that
 is a set of information that the packager believed had something in common. The
 package then is an abstract idea, and what is in the package could change
 completely over course of the life of the package. The package then is a little
@@ -59,14 +59,14 @@ moment, whether this has been committed or not.
 
 It might not be enjoyable, but we'll call a package instantiation a *pinstance*.
 
-Package instantiation revision
-==============================
+Pinstance revision
+==================
 
 A revision is an instantiation of the working tree that has a unique label - the
 *revision id*.
 
-Package revision id
-===================
+Pinstance revision id
+=====================
 
 The *revision id* is a string that identifies a particular *pinstance*. This is
 the equivalent of the revision number in subversion_, or the commit hash in
@@ -76,8 +76,8 @@ example, you might have a revision of id '200', delete a file, restore the file,
 call this revision id '201', but they might both refer to the same instantiation
 of the package. Or they might not, that's up to you, the author of the package.
 
-Package instantiation tag
-=========================
+Pinstance tag
+=============
 
 A *tag* is a memorable string that refers to a particular pinstance. It differs
 from a revision id only in that there is not likely to be a tag for every
@@ -91,6 +91,14 @@ equivalent of applying a git tag). ``release-0.3`` is the tag and ``af5bd6``
 is the revision id. Different sources of the same package might possibly
 produce different tags [#tag-sources]_
 
+Pinstance version
+=================
+
+A *pinstance* might also have a version. A version is just a tag that can be
+compared using some algorithm.
+
+.. _prundle:
+
 Package provider bundle
 =======================
 
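
The *Pinstance version* paragraph added above says only that a version is a tag that can be compared "using some algorithm". The sketch below shows one possible algorithm, as an illustration rather than anything the commit specifies: runs of digits compare as integers and everything else compares as text. The names ``version_key`` and ``is_newer`` are hypothetical and are not part of ``dang``.

```python
import re


def version_key(tag):
    """Split a tag such as 'release-0.10' into comparable parts.

    Hypothetical helper: runs of digits become integers, runs of letters
    stay as text, and punctuation is ignored.
    """
    parts = re.findall(r'\d+|[A-Za-z]+', tag)
    # Pair each part with a kind marker so ints and strs are never
    # compared with each other directly.
    return [(0, int(p)) if p.isdigit() else (1, p) for p in parts]


def is_newer(tag_a, tag_b):
    """Return True if ``tag_a`` is a later version than ``tag_b``."""
    return version_key(tag_a) > version_key(tag_b)
```

With this rule ``is_newer('0.10', '0.9')`` is true, which a plain string comparison would get wrong. Two tags that cannot be ordered sensibly by this rule would simply remain tags, not versions.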
@@ -127,6 +135,8 @@ A release might be a package instantiation that one person has:
 #. tagged
 #. made available as one or more *provider bundles*
 
+.. _prundle-discovery:
+
 Prundle discovery
 =================
 

doc/source/devel/data_pkg_uses.rst

Lines changed: 84 additions & 78 deletions
@@ -10,65 +10,74 @@ Usecases
 
 We are here working from :doc:`data_pkg_discuss`
 
-Distribution types
-==================
+Prundles
+========
 
-The most common type of distribution is a *file-system* distribution, that is,
-a distribution that can return content from passed filenames, as if it were a
-file-system. The examples below should make this clearer.
+See :ref:`prundle`.
 
-An *local path* distribution is a file-system distribution with content stored
-in files in a directory accessible on the local filesystem. We could also call
-this a *local path* distribution type.
+A *local path* format prundle is a directory on the local filesystem, with the
+prundle data stored in files in that directory.
 
 Examples
 ========
 
-Create distribution
--------------------
+We'll call our package `dang` - data package new generation.
 
-::
+Create local-path prundle
+-------------------------
 
->>> import os
->>> import dpkg
->>> dst = dpkg.LocalPathDistribution.initialize(name='my-package', path=/a/path')
->>> dst.name
-'my-package'
->>> dst.version
-'0'
->>> dst.path == os.path.abspath(/a/path')
-True
->>> os.listdir(/a/path')
-['meta.ini']
+>>> import os
+>>> import tempfile
+>>> pth = tempfile.mkdtemp() # temporary directory
 
-The local path distribution here is just the set of files in the ``a/path`` directory.
+Make a pinstance object
 
-The call to the ``initialize`` class method above creates the directory if it
-does not exist, and writes a bare ``meta.ini`` file to the directory, with the
-given ``name``, and default version of ``0``.
+>>> from dang import Pinstance
+>>> pri = Pinstance(name='my-package')
+>>> pri.pkg_name
+'my-package'
+>>> pri.meta
+{}
 
-Use local path distribution
----------------------------
+Now we make a prundle. First a directory to contain it
 
-::
+>>> import os
+>>> import tempfile
+>>> pth = tempfile.mkdtemp() # temporary directory
 
->>> dst = dpkg.LocalPathDistribution.from_path(path=/a/path')
->>> dst.name
-'my-package'
+>>> from dang.prundle import LocalPathPrundle
+>>> prun = LocalPathPrundle(pri, pth)
+
+At the moment there's nothing in the directory. The 'write' method will write
+the meta information - here just the package name.
+
+>>> prun.write() # writes meta.ini file
+>>> os.listdir(pth)
+['meta.ini']
+
+The local path prundle data is just the set of files in the temporary directory
+named in ``pth`` above.
+
+Now we've written the package, we can get it by a single call that reads in the
+``meta.ini`` file:
 
-Getting content
----------------
+>>> prun_back = LocalPathPrundle.from_path(pth)
+>>> prun_back.pkg_name
+'my-package'
 
-The file-system distribution types can return content by file names.
+Getting prundle data
+--------------------
 
-For example, for the local path ``dst`` distribution objects we have seen so
+The file-system prundle formats can return content by file names.
+
+For example, for the local path ``prun`` prundle objects we have seen so
 far, the following should work::
 
->>> fobj = dst.get_fileobj('a_file.txt')
+>>> fobj = prun.get_fileobj('a_file.txt')
 
 In fact, local path distribution objects also have a ``path`` attribute::
 
->>> fname = os.path.join(dst.path, 'a_file.txt')
+>>> fname = os.path.join(prun.path, 'a_file.txt')
 
 The ``path`` attribute might not make sense for objects with greater abstraction
 over the file-system - for example objects encapsulating web content.
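
The doctests in this hunk describe behaviour (``write``, ``from_path``, ``get_fileobj``, ``path``) without showing how it could work. Here is a minimal sketch of a class with that behaviour, assuming the ``meta.ini`` layout used elsewhere in this document (package name under ``[DEFAULT]``). ``LocalPathPrundleSketch`` is a hypothetical stand-in, not the ``dang.prundle.LocalPathPrundle`` class, and it takes a package name directly where the doctests pass a pinstance object.

```python
import configparser
import os
import tempfile


class LocalPathPrundleSketch:
    """Hypothetical stand-in for the ``LocalPathPrundle`` in the doctests."""

    def __init__(self, pkg_name, path, meta=None):
        self.pkg_name = pkg_name
        self.path = os.path.abspath(path)
        self.meta = dict(meta or {})

    def write(self):
        # Store the package name and any other metadata under [DEFAULT]
        cfg = configparser.ConfigParser(interpolation=None)
        cfg['DEFAULT'] = dict(self.meta, name=self.pkg_name)
        with open(os.path.join(self.path, 'meta.ini'), 'w') as fobj:
            cfg.write(fobj)

    @classmethod
    def from_path(cls, path):
        # Rebuild the object from the meta.ini file alone
        cfg = configparser.ConfigParser(interpolation=None)
        with open(os.path.join(path, 'meta.ini')) as fobj:
            cfg.read_file(fobj)
        meta = dict(cfg['DEFAULT'])
        return cls(meta.pop('name'), path, meta)

    def get_fileobj(self, fname):
        # Content is returned by file name, relative to the prundle directory
        return open(os.path.join(self.path, fname), 'rb')


pth = tempfile.mkdtemp()  # temporary directory
prun = LocalPathPrundleSketch('my-package', pth)
prun.write()  # writes meta.ini file
prun_back = LocalPathPrundleSketch.from_path(pth)
```

Keeping ``get_fileobj`` separate from ``path`` matters for the point made in the diff: a web-backed prundle could offer ``get_fileobj`` without having any meaningful ``path``.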
@@ -77,38 +86,39 @@ over the file-system - for example objects encapsulating web content.
 Discovery
 *********
 
-So far we only have distribution objects. In order for a program to use a
-distribution object it has to know where the distribution is.
+So far, in order to create a prundle object, we have to know where the prundle
+is (the path).
+
+We want to be able to tell the system where prundles are - and the system will
+then be able to return a prundle on request - perhaps by package name. The
+system here is answering a :ref:`prundle-discovery` query.
 
-We will then want to ask our packaging system whether it knows about the
-distribution we are interested in. This is a *discovery query*.
+We will then want to ask our packaging system whether it knows about the prundle
+we are interested in.
 
 Discovery sources
 =================
 
 A discovery source is an object that can answer a discovery query.
 Specifically, it is an object with a ``discover`` method, like this::
 
->>> dsrc = dpkg.get_source('local-system')
+>>> import dang
+>>> dsrc = dang.get_source('local-system')
 >>> dquery_result = dsrc.discover('my-package', version='0')
->>> dquery_result.distribution.name
+>>> dquery_result[0].pkg_name
 'my-package'
 >>> dquery_result = dsrc.discover('implausible-pkg', version='0')
->>> dquery_result.distribution is None
->>> True
--
-The ``discover`` method returns a discovery query result. This result contains
-a distribution object if it knows about the distribution with the given name and
-version; the distribution in the query is None otherwise.
+>>> len(dquery_result)
+0
 
 The discovery version number spec may allow comparison operators, as for
 ``distutils.version.LooseVersion``::
 
 >>> res = dsrc.discover(name='my-package', version='>=0')
->>> dst = rst.distribution
->>> dst.name
+>>> prun = res[0]
+>>> prun.pkg_name
 'my-package'
->>> dst.version
+>>> prun.meta['version']
 '0'
 
 Default discovery sources
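
After this hunk, ``discover`` returns a sequence that is empty when nothing matches, and the version argument may carry a comparison operator. A rough sketch of that contract follows. It is an assumption-laden illustration, not the ``dang`` implementation: ``ListSourceSketch`` and ``matches`` are hypothetical names, prundles are reduced to ``(pkg_name, version)`` pairs, and versions are assumed to be purely numeric dotted strings. It does not use ``distutils``, which is no longer part of the standard library from Python 3.12.

```python
import operator
import re

# Comparison operators the hypothetical version spec understands
_OPS = {'>=': operator.ge, '<=': operator.le, '==': operator.eq,
        '>': operator.gt, '<': operator.lt}


def _as_tuple(version):
    # '0.10.1' -> (0, 10, 1); assumes purely numeric dotted versions
    return tuple(int(part) for part in version.split('.'))


def matches(version, spec):
    """Return True if ``version`` satisfies ``spec`` such as '>=0'."""
    found = re.match(r'(>=|<=|==|>|<)?\s*(.+)$', spec)
    compare = _OPS[found.group(1) or '==']  # a bare '0' means '==0'
    return compare(_as_tuple(version), _as_tuple(found.group(2)))


class ListSourceSketch:
    """Hypothetical in-memory discovery source."""

    def __init__(self, prundles):
        self._prundles = list(prundles)  # (pkg_name, version) pairs

    def discover(self, name, version=None):
        # Always return a list; an empty list means "not found"
        return [prun for prun in self._prundles
                if prun[0] == name
                and (version is None or matches(prun[1], version))]


src = ListSourceSketch([('my-package', '0')])
```

Returning a list instead of a single object or ``None`` lets one query report several matching prundles, which is what makes a spec like ``'>=0'`` useful.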
@@ -137,51 +147,50 @@ from a list of sources. Something like this::
 >>> local_usr = dpkg.get_source('local-user')
 >>> src_pool = dpkg.SourcePool((local_usr, local_sys))
 >>> dq_res = src_pool.discover('my-package', version='0')
->>> dq_res.distribution.name
+>>> dq_res[0].pkg_name
 'my-package'
 
 We'll often want to do exactly this, so we'll add this source pool to those that
 can be returned from our ``get_source`` convenience function::
 
 >>> src_pool = dpkg.get_source('local-pool')
 
-Register a distribution
-=======================
+Register a prundle
+==================
 
-In order to register a distribution, we need a distribution object and a
+In order to register a prundle, we need a prundle object and a
 discovery source::
 
->>> dst = dpkg.LocalPathDistribution.from_path(path=/a/path')
->>> local_usr = dpkg.get_source('local-user')
->>> local_usr.register(dst)
+>>> from dang.prundle import LocalPathPrundle
+>>> prun = LocalPathPrundle.from_path(path='/a/path')
+>>> local_usr = dang.get_source('local-user')
+>>> local_usr.register(prun)
 
 Let us then write the source to disk::
 
->>> local_usr.save()
+>>> local_usr.write()
 
 Now, when we start another process as the same user, we can do this::
 
->>> import dpkg
->>> local_usr = dpkg.get_source('local-user')
->>> dst = local_usr.discover('my-package', '0')
+>>> import dang
+>>> local_usr = dang.get_source('local-user')
+>>> prun = local_usr.discover('my-package', '0')[0]
 
 **************
 Implementation
 **************
 
 Here are some notes. We had the hope that we could implement something that
 would be simple enough that someone using the system would not need our code,
-but could work from the specification. In practice we hope to be able get away
-with something that uses ``ini`` format files as base storage, because these are
-fairly standard and have support in the python standard library since way back.
+but could work from the specification.
 
-Local path distributions
-========================
+Local path prundles
+===================
 
-As implied above, these are directories accessible on the local filesystem.
-The directory needs to give information about the distribution name and version.
-An ``ini`` file is probably enough for this - something like a ``meta.ini`` file
-in the directory with::
+These are directories accessible on the local filesystem. The directory needs
+to give information about the prundle name and optionally, version, tag,
+revision id and maybe other metadata. An ``ini`` file is probably enough for
+this - something like a ``meta.ini`` file in the directory with::
 
 [DEFAULT]
 name = my-package
@@ -192,11 +201,8 @@ might be enough to get started.
 Discovery sources
 =================
 
-The discovery source has to be able to return distribution objects for the
-distributions it knows about. A discovery source might only be able to handle
-local path distributions, in which case all it needs to know about a
-distribution is the (name, version, path). So, a local path discovery source
-could be stored on disk as an ``ini`` file as well::
+The discovery source has to be able to return prundle objects for the
+prundles it knows about.
 
 [my-package]
 0 = /some/path
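
The ``ini`` fragment above maps a package name (the section) and a version (the key) to a path. A minimal sketch of a discovery source stored that way follows, supporting the ``register``, ``write`` and ``discover`` calls used in the doctests. ``IniSourceSketch`` is a hypothetical name; for brevity it registers and returns plain ``(name, version, path)`` tuples where the doctests use prundle objects.

```python
import configparser
import os
import tempfile


class IniSourceSketch:
    """Hypothetical discovery source backed by one ini file."""

    def __init__(self, fname):
        self.fname = fname
        # interpolation=None so that '%' in a path is stored literally
        self._cfg = configparser.ConfigParser(interpolation=None)
        self._cfg.read(fname)  # a missing file is silently ignored

    def register(self, name, version, path):
        # One section per package, one "version = path" entry per prundle
        if not self._cfg.has_section(name):
            self._cfg.add_section(name)
        self._cfg.set(name, version, path)

    def discover(self, name, version=None):
        if not self._cfg.has_section(name):
            return []
        return [(name, ver, path)
                for ver, path in self._cfg.items(name)
                if version is None or ver == version]

    def write(self):
        with open(self.fname, 'w') as fobj:
            self._cfg.write(fobj)


fname = os.path.join(tempfile.mkdtemp(), 'local-user.ini')
src = IniSourceSketch(fname)
src.register('my-package', '0', '/some/path')
src.write()
```

A second ``IniSourceSketch(fname)`` created in another process would then find the same entry, which is the "start another process as the same user" case described earlier.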
