Commit 294bb6b

Adding NFOSceneParser (#462)
1 parent 56e31db commit 294bb6b

9 files changed

+1570
-0
lines changed

plugins/nfoSceneParser/README.md

Lines changed: 200 additions & 0 deletions
@@ -0,0 +1,200 @@
# nfoSceneParser
Automatically and transparently populates your scene data (during scan) based on either:
- NFO files
- patterns in your file names, configured through regex.

Ideal for the "initial load" of a large set of new files (or even a whole library), so that you do not "start from scratch" in stash! *...provided you have nfo files and/or consistent patterns in your file names, of course...*

# Installation

- If you have not done so yet, install the required Python module: `pip install requests` (or `pip3 install requests`, depending on your Python setup). Note: if you are running stash as a Docker container, this is not needed as it is already installed.
- Download the whole `nfoSceneParser` folder
- Place it in your `plugins` folder (where the `config.yml` is)
- Reload plugins (`Settings > Plugins > Reload`)
- `nfoSceneParser` appears
- Scan some new files...

The plug-in is automatically triggered on each new scene creation (typically during scan).

# Usage

Imports scene details from nfo files or from regex patterns.

Every time a new scene is created, it will:
- Look for a matching NFO file and parse it into the scene data (studio, performers, date, name,...)
- If no NFO is found, use a regular expression (regex) to parse your file name for patterns. This fallback works only if you have consistent & identifiable patterns in (some of) your file names. Read carefully below how to configure regex to match your file name pattern(s).
- If neither is found, do nothing ;-)

NFO files comply with KODI's 'Movie Template' specification (https://kodi.wiki/view/NFO_files/Movies). Note: although initially created by KODI, this NFO structure has become a de facto standard among video management software and is used today far beyond its KODI roots to store video files' metadata.

Regex patterns comply with Python's regular expression syntax. A good tool to write/test regex is: https://regex101.com/

## config.py

nfoSceneParser works without any config edits. If you want more control, have a look at `config.py`, where you can change some default behavior.

## Reload task

nfoSceneParser typically processes everything during scan. If you want to reload the nfo/regex at a later time, you can execute a "reload" task.

It works in three steps: configure, select & run:
- Configure: edit `reload_tag` in the plugin's `config.py` file. Set it to the name of an existing tag in your stash. The plugin uses it as the "marker" tag to identify which scenes to reload.
- Select: add the configured tag to your scenes to "mark" them.
- Run: execute the "reload" task: stash's settings -> "Tasks" -> scroll down to "plugin tasks" / nfoSceneParser (at the bottom) -> "Reload tagged scenes" button

A reload essentially merges the new file data with the existing scene data, giving priority to the nfo/regex content. More specifically:
- For single-value fields, it overrides what is already set if other content is found
- For single-value fields, it keeps what is already set if nothing is found
- For multi-value fields, it adds to the existing values.

Note: The marker tag is removed from the reloaded scenes (unless it is present in the nfo or regex), so there is no need to remove it manually...
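
The merge rules of a reload can be sketched as follows. This is only an illustration of the three rules, not the plugin's actual code: the `merge_for_reload` name, the `MULTI_VALUE_FIELDS` tuple and the duplicate check on append are assumptions made for the example.

```python
# Assumption: which fields are treated as multi-value is chosen for this sketch.
MULTI_VALUE_FIELDS = ("performers", "tags")


def merge_for_reload(existing, parsed):
    """Merge freshly parsed nfo/regex data into the existing scene data."""
    merged = dict(existing)
    for field, new_value in parsed.items():
        if field in MULTI_VALUE_FIELDS:
            # Multi-value: keep existing entries and append the new ones.
            combined = list(existing.get(field) or [])
            for item in new_value or []:
                if item not in combined:
                    combined.append(item)
            merged[field] = combined
        elif new_value not in (None, ""):
            # Single-value: parsed content wins when it is present.
            merged[field] = new_value
        # Otherwise nothing was found: the existing value is kept as-is.
    return merged


existing = {"title": "Old title", "studio": "Old studio", "tags": ["a"]}
parsed = {"title": "New title", "studio": None, "tags": ["a", "b"]}
merged = merge_for_reload(existing, parsed)
```

Here `title` is overridden, `studio` is kept because nothing new was found, and `tags` grows by the one new value.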

# NFO files organization

## Scene NFO

The plugin automatically looks for .nfo files (and optionally thumbnail images) in the same directory and with the same file name as your video file (for instance, for a `BestSceneEver.mp4` video, it will look for a corresponding `BestSceneEver.nfo` file). The `nfo_location` setting in `config.py` is meant to allow an alternate location for your NFO files, but according to the comments in `config.py`, only "with files" is currently supported.
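
As a small illustration of that naming rule (a sketch only; `expected_nfo_path` is a hypothetical helper, not a function of the plugin), the expected nfo path is the video path with its extension swapped:

```python
import os


def expected_nfo_path(video_path):
    """Return the path where a scene nfo is expected for the given video."""
    base, _extension = os.path.splitext(video_path)
    return base + ".nfo"


nfo_path = expected_nfo_path("/videos/BestSceneEver.mp4")
```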

## Folder NFO

If a `folder.nfo` file is present, it is loaded and applied as the default for all scene files within the same folder. A scene-specific nfo overrides the defaults from `folder.nfo`.

So if you have a `folder.nfo` with a studio or a performer, they will automatically be applied to all scenes in the folder, even if there is no specific nfo for each scene file.

`folder.nfo` files are also used to create movies. See below for details on movie support.
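
The override behavior can be pictured as a lookup through layers, most specific first. The sketch below is an assumption-laden illustration (plain dictionaries stand in for parsed nfo files, and `resolve_field` is a hypothetical name), not the plugin's implementation:

```python
def resolve_field(key, scene_nfo, folder_nfo):
    """Return the scene-specific value if set, else the folder-wide default."""
    for source in (scene_nfo, folder_nfo):
        value = source.get(key)
        if value is not None:
            return value
    return None


# Dictionaries standing in for already-parsed nfo files.
folder_nfo = {"studio": "Best studio ever", "performers": ["first1 last1"]}
scene_nfo = {"title": "BestSceneEver", "studio": "Another studio"}

studio = resolve_field("studio", scene_nfo, folder_nfo)
performers = resolve_field("performers", scene_nfo, folder_nfo)
```

The scene's own `studio` wins, while `performers` falls back to the value from `folder.nfo`.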

## Image support

Thumbnail images are supported either from `<thumb>` tags in the NFO itself (link to an image URL) or, alternatively, are loaded from the local disk (following KODI's naming convention for movie artwork). The plug-in will use the first image it finds among:
- A local image with the `-landscape` or `-poster` suffix, or no suffix (example: `BestSceneEver-landscape.jpg` or `BestSceneEver.jpg`). If you have movie info in your nfo, two images will be loaded for front & back posters (example: `folder-poster.jpg` and `folder-poster1.jpg`)
- A download of the `<thumb>` tag's URL (if there are multiple thumb fields in the nfo, the one with the "landscape" attribute has priority over "poster").
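
The "landscape before poster" priority among `<thumb>` tags could look like the sketch below. It is a guess at the selection logic for illustration only: `pick_thumb_url` is a hypothetical helper, the example URLs are placeholders, and the final fallback to any thumb without a matching `aspect` is an assumption.

```python
import xml.etree.ElementTree as ET


def pick_thumb_url(nfo_root, preferred=("landscape", "poster")):
    """Return the URL of the first <thumb> matching the preferred aspects."""
    thumbs = nfo_root.findall("thumb")
    for aspect in preferred:
        for thumb in thumbs:
            if thumb.get("aspect") == aspect and thumb.text:
                return thumb.text.strip()
    # Assumption: with no matching aspect, use the first thumb that has a URL.
    for thumb in thumbs:
        if thumb.text:
            return thumb.text.strip()
    return None


root = ET.fromstring(
    "<movie>"
    '<thumb aspect="poster">https://example.invalid/poster.jpg</thumb>'
    '<thumb aspect="landscape">https://example.invalid/landscape.jpg</thumb>'
    "</movie>"
)
thumb_url = pick_thumb_url(root)
```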

## Movie support

Movies are automatically found and created in stash from the nfo files. The plugin supports two alternatives:
- `folder.nfo`, if present, contains data valid for all scene files in the same directory. That is the very definition of a movie. The `<title>` tag designates the movie name, with all other relevant tags used to create the movie with all its details (`<date>`, `<studio>`, `<director>`, front/back image from `<thumb>`)
- Inside the scene nfo, through the `<set>` tag, which designates the group/set to which multiple scenes belong.

Example for `folder.nfo`:
```xml
<movie>
    <title>My Movie Title</title>
    <plot>You have to see it to believe it...</plot>
    <thumb aspect="poster">https://front_cover.jpg</thumb>
    <thumb aspect="poster">https://back_cover.jpg</thumb>
    <studio>Best studio ever</studio>
    <director>Georges Lucas</director>
</movie>
```

Example for `BestSceneEver.nfo`:

```xml
<movie>
    <title>BestSceneEver</title>
    <plot>Scene of the century</plot>
    <thumb aspect="landscape">https://scene_cover.jpg</thumb>
    <studio>Best studio ever</studio>
    <set>
        <name>My Movie Title</name>
        <index>2</index>
    </set>
</movie>
```

## url support

The nfo spec does not officially support `<url>` tags, but given their importance for stash, nfoSceneParser supports them as an nfo extension: they are recognized and applied to your scenes and movies.

## Mapping between stash data and nfo fields

stash scene fields | nfo movie fields
------------------------ | ---------------------
`title` | `title` or `originaltitle` or `sorttitle`
`details` | `plot` or `outline` or `tagline`
`studio` | `studio`
`performers` | `actor.name` (sorted by `actor.order`)
`movie` | `set.name` (sorted by `set.index`) or `title` from folder.nfo
`rating` | `userrating` or `ratings.rating`
`tags` | `tag` or `genre`
`date` | `premiered` or `year`
`url` | `url`
`director` (for movies) | `director` (only for folder.nfo)
`cover image` (or `front`/`back` for movies) | `thumb` (or local file)
`id` | `uniqueid`

Note: `uniqueid` support is only for existing stash scenes that were exported before (so they are updated "in place" with their existing id)
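
To make the mapping more concrete, here is a minimal sketch that reads the `BestSceneEver.nfo` example with Python's standard `xml.etree.ElementTree` and applies part of the table. The helper names (`first_text`, `nfo_to_scene`) are hypothetical, only a subset of the fields is covered, sorting by `actor.order` is left out, and combining `tag` and `genre` into one list is an assumption.

```python
import xml.etree.ElementTree as ET

NFO_TEXT = """
<movie>
    <title>BestSceneEver</title>
    <plot>Scene of the century</plot>
    <thumb aspect="landscape">https://scene_cover.jpg</thumb>
    <studio>Best studio ever</studio>
    <set>
        <name>My Movie Title</name>
        <index>2</index>
    </set>
</movie>
"""


def first_text(root, *paths):
    """Return the text of the first path that exists and is non-empty."""
    for path in paths:
        text = root.findtext(path)
        if text and text.strip():
            return text.strip()
    return None


def nfo_to_scene(nfo_text):
    """Map a subset of the nfo fields to stash scene fields."""
    root = ET.fromstring(nfo_text.strip())
    tag_nodes = root.findall("tag") + root.findall("genre")
    return {
        "title": first_text(root, "title", "originaltitle", "sorttitle"),
        "details": first_text(root, "plot", "outline", "tagline"),
        "studio": first_text(root, "studio"),
        "performers": [a.findtext("name") for a in root.findall("actor")],
        "movie": first_text(root, "set/name"),
        "scene_index": first_text(root, "set/index"),
        "tags": [t.text for t in tag_nodes if t.text],
        "date": first_text(root, "premiered", "year"),
    }


scene = nfo_to_scene(NFO_TEXT)
```

Fields that are absent from the nfo simply come back as `None` or as an empty list.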

# Regex pattern matching

Regular expressions work by recognizing patterns in your file names. They are the fallback if no NFO can be found.

You need to configure a custom pattern (matching, for instance, studio, performers or movie) that is specific to your naming convention. So a little bit of configuration is needed to "tell the plugin" how to recognize the right patterns.

Patterns are written in the "regular expression" (regex) standard.

## Regex configuration - not your typical plugin

A consistent and uniform naming convention across a whole media library is extremely unlikely. Therefore, nfoSceneParser supports not one, but multiple `nfoSceneParser.json` regex config files. They are placed alongside your media files, directly in the library.

A configuration file applies to all files and subdirectories below it.

Config files can be nested inside the library's directory tree. In this case, the deepest and most specific config is always used.

`nfoSceneParser.json` configs are searched for and loaded each time the plug-in is executed. They can be added, modified or removed while stash is running, without the need to "reload" the plugins.
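
"Deepest config wins" amounts to walking up from the scene's directory and stopping at the first `nfoSceneParser.json` found. The sketch below shows that idea in an iterative form; `find_nearest_config` is a hypothetical name for this example and is not taken from the plugin.

```python
import os

CONFIG_NAME = "nfoSceneParser.json"


def find_nearest_config(start_dir, config_name=CONFIG_NAME):
    """Walk up from start_dir and return the first config file found, or None."""
    current = os.path.abspath(start_dir)
    while True:
        candidate = os.path.join(current, config_name)
        if os.path.exists(candidate):
            return candidate
        parent = os.path.dirname(current)
        if parent == current:
            # Reached the filesystem root without finding a config.
            return None
        current = parent
```

Because the search starts at the scene's own directory, a config placed deeper in the tree is always found before one placed higher up.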

## File structure `nfoSceneParser.json`

Configuration files consist of one regex and some attributes.

| Name | Required | Description |
| ------------- | -------- | -------------------------------------- |
| regex | true | A regular expression (regex). Regex can be easily learned, previewed and tested via [https://regex101.com/](https://regex101.com/) |
| splitter | false | Used to further split the matched "performers" or "tags" text into an array of strings (the most frequent use case being a list of actors or tags). For instance, if performers matches `"Megan Rain, London Keyes"`, a splitter of `", "` will separate the two performers from the matched string |
| scope | false | Possible values are "path" or "filename". Whether the regex is applied to the scene's whole path or just the file name. Defaults to "path" |

## Example `nfoSceneParser.json`

Let's assume the following directory and file structure:

`/movies/movie series/Movie Name 17/Studio name - first1 last1, first2 last2 - Scene title - 2017-12-31.mp4`

A common naming convention is used for all files under the "movie series" directory => the `nfoSceneParser.json` file is placed in `/movies/movie series`.

We want to identify the following patterns:
- The deepest folder is the `movie`
- The file name has different sections, all separated by the same `' - '` delimiter. We can therefore use this to delimit and match the `studio`, the `performers` and the scene's `title`.
- The `date` is matched automatically. There is nothing to configure for that.

`nfoSceneParser.json` (remember: to be placed in your library)
```json
{
    "regex": "^.*[/\\\\](?P<movie>.*?)[/\\\\](?P<studio>.*?) - (?P<performers>.*?) - (?P<title>.*?)[-]+.*\\.mp4$",
    "splitter": ", ",
    "scope": "path"
}
```

A quick look at the regex:
- `[/\\]` matches slash & backslash, making it work on Windows and Unix paths alike (macOS, Linux,...). In the JSON file it is written `[/\\\\]`.
- Capturing groups like `(?P<movie>.*?)` have names that must match the supported nfoSceneParser attributes (see below)

Note: in json, every `\` is escaped to `\\` => `\\` in json is actually `\` in the regex. If you are unfamiliar, look for a json regex formatter online and paste your regex there to get the properly "escaped" string you need to use in the config file.
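
If you want to check a config outside of stash, the following sketch loads the example JSON with Python's `json` module and applies the regex with `re` to the example path. It only illustrates the JSON escaping and the named groups; it is not the plugin's code. Both path separators are written `[/\\\\]` in the JSON text here, and stripping whitespace around the matched values is an assumption of this example.

```python
import json
import re

# Raw string: the text below is exactly what would be stored in the JSON file.
CONFIG_TEXT = r"""
{
    "regex": "^.*[/\\\\](?P<movie>.*?)[/\\\\](?P<studio>.*?) - (?P<performers>.*?) - (?P<title>.*?)[-]+.*\\.mp4$",
    "splitter": ", ",
    "scope": "path"
}
"""

config = json.loads(CONFIG_TEXT)
# After JSON decoding, "\\\\" has become "\\" and "\\." has become "\.".
pattern = re.compile(config["regex"])

path = (
    "/movies/movie series/Movie Name 17/"
    "Studio name - first1 last1, first2 last2 - Scene title - 2017-12-31.mp4"
)
match = pattern.match(path)

# Strip surrounding spaces (the lazy title group keeps the space before " - ").
fields = {name: value.strip() for name, value in match.groupdict().items()}
fields["performers"] = fields["performers"].split(config["splitter"])
```

With the example path, `fields` holds the movie, studio, title and a list of two performers. The date is not captured by this regex, since the plug-in detects it on its own.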

## Supported regex capturing group names

The following names can be used for your regex capturing groups:
- title
- date
- performers
- tags
- studio
- rating
- movie
- director
- index (mapped to stash scene_index - only relevant for movies)

Note: if `date` is not specified, the plug-in attempts to detect the date anywhere in the file name.
Lines changed: 32 additions & 0 deletions
@@ -0,0 +1,32 @@
import os


class AbstractParser:

    empty_default = { "actors": [], "tags": [] }

    # Max number of images to process (2 for front/back cover in movies).
    _image_Max = 2

    def __init__(self):
        self._defaults = [self.empty_default]

    def _find_in_parents(self, start_path, searched_file):
        # Look for searched_file in start_path, then in each parent directory.
        parent_dir = os.path.dirname(start_path)
        file = os.path.join(start_path, searched_file)
        if os.path.exists(file):
            return file
        elif start_path != parent_dir:
            # Not found => recurse via parent
            return self._find_in_parents(parent_dir, searched_file)
        # Root reached without a match: falls through and returns None.

    def _get_default(self, key, source=None):
        # Return the first non-None value for key among the defaults.
        for default in self._defaults:
            # Source filter: skip default if it is not of the specified source
            if source and default.get("source") != source:
                continue
            if default.get(key) is not None:
                return default.get(key)

    def parse(self):
        pass

plugins/nfoSceneParser/config.py

Lines changed: 70 additions & 0 deletions
@@ -0,0 +1,70 @@
# If dry_mode is True, will do a trial run with no permanent changes.
# Look in the log file for what would have been updated...
dry_mode = False

# nfo file location & naming.
# Possible options:
# - "with files": with the video files. Follows NFO standard naming: https://kodi.wiki/view/NFO_files/Movies
# - "...": a specific directory you mention. In this case, the nfo names will match your stash scene ids.
# if you set the above to "with files", it'll force filename anyway, to match the filename.
# ! Not yet implemented. Currently, only "with files" is supported
nfo_location = "with files"

# If True, will never update already "organized" scenes.
skip_organized = True

# If True, will set the scene to "organized" on update from nfo file.
set_organized_nfo = True

# Set of fields that must be set from the nfo (i.e. "not be empty") for the scene to be marked organized.
# Possible values: "performers", "studio", "tags", "movie", "title", "details", "date",
# "rating", "urls" and "cover_image"
set_organized_only_if = ["title", "performers", "details", "date", "studio", "tags", "cover_image"]

# Blacklist: array of nfo fields that will not be loaded into the scene.
# Possible values: "performers", "studio", "tags", "movie", "title", "details", "date",
# "rating", "urls", "cover_image" and "director"
# Note: "tags" is a special case: if blacklisted, new tags will not be created, but existing tags will be mapped.
blacklist = ["rating"]

# List of tags that will never be created or set to the scene.
# Example: blacklisted_tags = ["HD", "Now in HD"]
blacklisted_tags = ["HD", "4K", "Now in HD", "1080p Video", "4k Video"]

# Name of the tag used as "marker" by the plugin to identify which scenes to reload.
# Empty string or None disables the reload feature
reload_tag = "_NFO_RELOAD"

# Creates missing entities in stash's database (or not)
create_missing_performers = True
create_missing_studios = True
create_missing_tags = True
create_missing_movies = True

###############################################################################
# Do not change config below unless you are absolutely sure of what you do...
###############################################################################

# Whether to also look for existing entries in aliases
search_performer_aliases = True
search_studio_aliases = True

levenshtein_distance_tolerance = 2

# "Single names" means performers with only one word as name like "Anna" or "Siri".
# If True, single-name aliases will be ignored:
# => only the "main" performer name determines if a performer exists or is created.
# Only relevant if search_performer_aliases is True.
ignore_single_name_performer_aliases = True

# If the above is set to True, it can be overruled for some allowed (whitelisted) names
single_name_whitelist = ["MJFresh", "JMac", "Mazee"]

###############################################################################
# Reminder: if no matching NFO file can be found for the scene, a fallback
# "regular expressions" parsing is supported.
#
# ! regex patterns are defined in their own config files.
#
# See README.md for details
###############################################################################

plugins/nfoSceneParser/log.py

Lines changed: 52 additions & 0 deletions
@@ -0,0 +1,52 @@
import sys


# Log messages sent from a plugin instance are transmitted via stderr and are
# encoded with a prefix consisting of special character SOH, then the log
# level (one of t, d, i, w, e, or p - corresponding to trace, debug, info,
# warning, error and progress levels respectively), then special character
# STX.
#
# The LogTrace, LogDebug, LogInfo, LogWarning, and LogError methods, and their equivalent
# formatted methods are intended for use by plugin instances to transmit log
# messages. The LogProgress method is also intended for sending progress data.
#

def __prefix(level_char):
    start_level_char = b'\x01'
    end_level_char = b'\x02'

    ret = start_level_char + level_char + end_level_char
    return ret.decode()


def __log(level_char, s):
    # level_char is a bytes value (e.g. b'i'); skip logging if it is empty.
    if not level_char:
        return

    print(__prefix(level_char) + s + "\n", file=sys.stderr, flush=True)


def LogTrace(s):
    __log(b't', s)


def LogDebug(s):
    __log(b'd', s)


def LogInfo(s):
    __log(b'i', s)


def LogWarning(s):
    __log(b'w', s)


def LogError(s):
    __log(b'e', s)


def LogProgress(p):
    # Clamp the progress value to the [0, 1] range.
    progress = min(max(0, p), 1)
    __log(b'p', str(progress))
