diff --git a/FPM.ftd b/FPM.ftd
index 78994fbac..719e0d3c2 100644
--- a/FPM.ftd
+++ b/FPM.ftd
@@ -100,6 +100,7 @@ implements: fifthtry.github.io/theme
 - Contribution Guide: /dev/contribute/
 - How Versioning Works: /dev/versioning/
 - What is `FPM.manifest.ftd`?: /dev/fpm-manifest/
+- What is `digest.json`?: /dev/digest/
 - File System Organization: /fs/
 - `FPM.ftd`: /fpm-ftd-file/
 - Day To Day:
diff --git a/dev/digest.ftd b/dev/digest.ftd
new file mode 100644
index 000000000..9f5190974
--- /dev/null
+++ b/dev/digest.ftd
@@ -0,0 +1,283 @@
+-- ft.page: FPM New Design
+
+Currently the `fpm` CLI only works against the local file system. We now want it
+to evolve into something that can both use the local file system and work off a database.
+
+-- ft.h1: `fpm serve` is "dynamic"
+
+`fpm serve` runs an HTTP server and serves the FTD and other files in the current
+fpm package dynamically. Contrast it with `fpm build && cd .build && python -m
+http.server`: when running like this we always get the same content, since
+`fpm build` runs once and builds all the files into the `.build` folder.
+
+Each built file is now static, and `python -m http.server` always serves the same
+content for the same URL. `fpm serve` builds the response on the fly, when the HTTP
+request comes. This means that if an ftd file is dynamic, say it uses a processor
+that fetches data from a database, the data is refetched on every HTTP request
+when the file is served via `fpm serve`, but fetched only once, at build time,
+when `fpm build` is used.
+
+-- ft.h1: Hosting `fpm serve` With Files
+
+Since `fpm serve` is dynamic, it lets authors write more powerful documents and
+make better use of `ftd/fpm`. But so far `fpm serve` only runs locally, on your
+machine, exposing the server on `http://127.0.0.1:8000`.
+
+You can run `fpm serve` on your own VPS on DigitalOcean or EC2. You will have
+to check out your FPM package content on the VPS.
+
+For this we do not need anything more: the moment the [`fpm serve` dynamically
+PR](https://github.com/FifthTry/fpm/pull/201) is merged we are good to go.
+
+But it has one main disadvantage: managing FTD files would still have to be
+done via `sFTP/FTP/git` etc. When the fpm package content changes, you will have
+to deploy the content to your server somehow.
+
+
+-- ft.h1: FPM Package Digest: `digest.json`
+
+`digest.json` is an in-memory data structure which is generated first when any
+`fpm` command is invoked locally. The content of `digest.json` is also stored in
+the `packages` table on remote.
+
+For every FTD and markdown file in the package, it will contain the content of
+that file.
+
+For every non FTD/md file it will contain the filename, which will be stored in
+a list under the `other-files` key.
+
+It will also contain history and tracking metadata.
+
+-- ft.code: `digest.json`
+lang: json
+
+{
+    "index.ftd": "content of index.ftd",
+    "other-files": [
+        "images/logo.png",
+        "foo.md",
+        "hello.py"
+    ],
+    "history": {
+        "index.ftd": [
+            { "updated": "" },
+            { "created": "" }
+        ]
+    },
+    "tracks": {
+        "index.ftd": [
+            {
+                "foo.ftd": {
+                    "last-merged-on": ""
+                }
+            }
+        ]
+    }
+}
+
+-- ft.markdown:
+
+NOTE: we are showing timestamps as strings, but they will actually be integers.
+
+When we download a package, we will first extract its content and create
+`.packages/.digest.json`.
+
+-- ft.h1: What's In DB?
+
+-- ft.h2: `other-files` table
+
+For every "other-file", we will have a row containing the content of that file.
+
+-- ft.code:
+lang: sql
+
+select * from other_files;
+| filename              | content of file |
+| amitu.com/foo.png     |                 |
+| fifthtry.com/logo.png |                 |
+
+-- ft.h2: `packages` table
+
+For every package, we will also have a row containing the `digest.json` of
+that package.
+
+-- ft.code:
+lang: sql
+
+select * from packages;
+| package name        | content of digest.json                    |
++---------------------+-------------------------------------------|
+| amitu.com           |                                           |
+| amitu.com/x         |                                           |
+| amitu.com/x/y       |                                           |
+| fifthtry.com        |                                           |
+
+-- ft.h2: `history` table
+
+The history will be stored in the `history` table:
+
+-- ft.code:
+lang: sql
+
+select * from history;
+| filename            | timestamp | event   | content |
+| amitu.com/index.ftd |           | created |         |
+| amitu.com/index.ftd |           | updated |         |
+
+-- ft.markdown:
+
+In future we will also store a `diff` column, which will show the difference
+between this document and the previous row.
+
+And also a `cr` column: if the change came from a CR it will contain the CR
+number, and if the change happened directly on main, `cr` will be null.
+
+-- ft.h2: `fpm-files` table
+
+Since the FPM.ftd file is needed to serve every static file (because FPM.ftd
+contains authentication/access control information), we have to read it very
+frequently, so we will keep it in a separate table only to be used when
+serving static file requests.
+
+-- ft.code:
+lang: sql
+
+select * from fpm_files;
+
+| package name | FPM.ftd content         |
++--------------+-------------------------+
+| amitu.com    |                         |
+
+
+
+-- ft.h1: How To Load .digest.json On Local?
+
+Look for `.fpm-workspace/digest.json`.
+
+-- ft.h1: How To Load .digest.json On Remote?
+
+Say the db looks like this:
+
+-- ft.code:
+lang: sql
+
+select * from packages;
+| package              | content                                   |
++----------------------+-------------------------------------------+
+| amitu.com/           |                                           |
+| amitu.com/x/         |                                           |
+| amitu.com/x/y/       |                                           |
+| fifthtry.com/        |                                           |
+
+-- ft.markdown:
+
+If a request to `amitu.com/foo` comes, we have to read the content of the
+`amitu.com` row. But if `amitu.com/x/y/z` comes then we have to read the content of
+`amitu.com/x/y`. How do we know which row to read?
+
+Step 1: Find the domain name only, `amitu.com`, and then look for all rows:
+
+-- ft.code:
+lang: sql
+
+select package from packages where package ilike 'amitu.com/%';
+
+| amitu.com/     |
+| amitu.com/x/   |
+| amitu.com/x/y/ |
+
+-- ft.markdown:
+
+Step 2: Find the longest `package` in this list for which
+`"amitu.com/foo".starts_with(package)` is true. Here the answer would be
+`amitu.com/`, so this is the package that contains `amitu.com/foo`.
+
+Now `select content from packages where package = 'amitu.com/'`.
+
+-- ft.h1: How to serve the file?
+
+A request has come to the fpm HTTP server, and we want to serve it.
+
+
+-- ft.h2: FTD files: URL with no extension, or URL ending with `index.html`
+
+If the URL ends with `index.html`, we will delete the ending `index.html` to get the
+`real path`.
+
+
+We then check if the `real path` is present in `digest.json` for the package, e.g. if
+the digest was:
+
+
+-- ft.code:
+lang: json
+
+{
+    "foo/index.ftd": "",
+    "FPM.ftd": "",
+    ... other stuff omitted ...
+}
+
+-- ft.markdown:
+
+NOTE: if the URL contains `-` then the `real path` would be the part till the
+first `-`, e.g. if the path was `amitu.com/foo/-/bar/`, the real path would be `foo`.
+Ask Arpita how to handle such URLs.
+
+If the `real path` is `foo`, then either `foo.ftd` or `foo/index.ftd` has to
+be present in the digest.
+
+We then know it is an ftd file and we can serve it. When `process_ftd()` is called,
+on every import (`ftd::Interpreter::StuckOnImport`) corresponding to a foreign
+package we have to read the row for that package from the `packages` table (or from
+disk, `.packages/.digest.json` if running locally). Here we do
+not have to do step 1, because the list of dependencies for this package is already
+known, so we can do the equivalent of step 1 from the content of FPM.ftd alone, with
+no additional (db/fs) read.
+
+If the file is missing in `digest.json` then we give 404.
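The package lookup (step 2) and the `real path` rules above can be sketched as follows. This is a minimal illustration in Python, not fpm's actual code: the names `find_package`, `real_path` and `find_document` are hypothetical, the `packages` list stands in for the result of the step 1 query, treating `-` as a whole path segment is an assumption based on the NOTE (which is itself still an open question), and falling back to `index.ftd` for an empty path is also an assumption.

```python
from typing import Optional


def find_package(url: str, packages: list[str]) -> Optional[str]:
    """Step 2: pick the longest package name that is a prefix of the URL."""
    matches = [p for p in packages if url.startswith(p)]
    return max(matches, key=len) if matches else None


def real_path(url: str, package: str) -> str:
    """Turn a request URL into the path to look up inside the package digest."""
    segments = url[len(package):].split("/")
    # Assumption from the NOTE: keep only the part before the first `-` segment.
    if "-" in segments:
        segments = segments[: segments.index("-")]
    # A trailing `index.html` is dropped.
    if segments and segments[-1] == "index.html":
        segments = segments[:-1]
    return "/".join(s for s in segments if s)


def find_document(path: str, digest: dict) -> Optional[str]:
    """Return the digest key to serve, or None (which maps to a 404)."""
    if path:
        candidates = [f"{path}.ftd", f"{path}/index.ftd"]
    else:
        # Assumption: the package root is served from `index.ftd`.
        candidates = ["index.ftd"]
    for candidate in candidates:
        if candidate in digest:
            return candidate
    return None


# Rows returned by the step 1 query, and the example digest from this section.
packages = ["amitu.com/", "amitu.com/x/", "amitu.com/x/y/"]
digest = {"foo/index.ftd": "", "FPM.ftd": ""}

package = find_package("amitu.com/foo", packages)
path = real_path("amitu.com/foo/-/bar/", "amitu.com/")
document = find_document(path, digest)
```

With the example digest, `amitu.com/foo/-/bar/` resolves to the key `foo/index.ftd`; a path with no matching key yields `None`, which corresponds to the 404 case.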
+
+-- ft.h2: Any other URL (non ftd url, mostly images)
+
+We assume this is a static file, so we only do step 1, and not step 2, as we
+are not going to need the full digest. We will read the FPM.ftd content from
+`fpm-files` for the package name we found from step 1. Using FPM.ftd we will
+check if auth/acl allows you to read the static file. If not we give 403.
+
+If the acl check passes, we then query the `other-files` table to get the content
+of that file and serve it.
+
+-- ft.h1: How do we handle markdown files?
+
+Markdown files (and their content) would also be present in `digest.json`.
+When looking for `foo` we will look for `foo.ftd`, `foo/index.ftd`, `foo.md`
+and `foo/README.md`, in this order.
+
+-- ft.h1: On local, how often do we regenerate `digest.json`?
+
+When we launch `fpm serve`, it first reads the content of the current package
+directory, generates an in-memory `digest.json`, and starts a background
+thread which watches for file system changes and keeps updating the in-memory
+`digest.json`.
+
+-- ft.h2: What about local changes in dependencies?
+
+Since people may modify dependencies as part of development, we have to
+generate `digest.json` when any fpm command starts. The file system watcher of
+`fpm serve` will also watch the dependencies, and update their `digest.json` as
+well.
+
+Note: local changes in dependencies will make the output of subsequent fpm commands
+unreliable. The diff may be wrong if you are tracking things, `fpm status` may
+give wrong info, etc. In future we will use a "fallback to .." technique, so if
+you want to develop a dependency, you do not modify it in the `.packages` folder,
+but check out the dependency in `..` of the current package, and our dependency
+checker will look both in `.packages` and `..` when it is looking for the
+`foo.com/bar` package as a dependency of the current package.
+
+
+-- ft.h1: `digest.json` on remote
+
+It is only updated when someone does `fpm sync` and some content on remote
+changes. Since the digest contains history and metadata for all static files,
+`digest.json` has to be updated whenever any file changes.
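To make the digest shape from "FPM Package Digest" concrete, here is a minimal Python sketch of how an in-memory digest could be built from a package directory, as happens when an fpm command starts. It is an illustration under stated assumptions, not fpm's implementation: the function name `build_digest` is hypothetical, `history` and `tracks` are left empty because the document does not say how they are populated, and skipping hidden entries such as `.packages` or `.build` is an assumption.

```python
from pathlib import Path


def build_digest(package_dir: str) -> dict:
    """Build an in-memory digest for the package rooted at `package_dir`."""
    root = Path(package_dir)
    # `history` and `tracks` are placeholders; how they are filled is not covered here.
    digest = {"other-files": [], "history": {}, "tracks": {}}
    for file in sorted(root.rglob("*")):
        if not file.is_file():
            continue
        relative = file.relative_to(root)
        # Assumption: hidden entries such as `.packages` or `.build` are skipped.
        if any(part.startswith(".") for part in relative.parts):
            continue
        name = relative.as_posix()
        if file.suffix in (".ftd", ".md"):
            # FTD and markdown files are stored together with their content.
            digest[name] = file.read_text(encoding="utf-8")
        else:
            # Every other file is recorded by name only.
            digest["other-files"].append(name)
    return digest
```

A file system watcher, as described for `fpm serve`, could call the same function again (or update single keys) whenever a file changes.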