saggu left a comment
Every function should have augmented parameters. A lot of changes are required, and some mistakes are repeated from previous PRs. Please update.
| @@ -0,0 +1,177 @@
| ## Summary
|
| This command will augmented graph from a KGTK Edge file with numeric value in float (or date) on node2. This command will automatically detect date in wikidata format and transform it to float in year
Please fix the grammar. Also, I can't tell from this description what the command actually does. Please update.
|
| ### The Output File
|
| The output file is an edge file for each mode that contains the following columns:
What extra edges will be added? Please include an example.
|   -o OUTPUT_FILE, --output-file OUTPUT_FILE
|                         The KGTK output file. (May be omitted or '-' for
|                         stdout.)
|   --dataset DATASET     Specify the location of dataset.
What exactly is the location of the dataset?
|                         The KGTK output file. (May be omitted or '-' for
|                         stdout.)
|   --dataset DATASET     Specify the location of dataset.
|   --train-file-name TRAIN_FILE_NAME
All of these new parameters need longer descriptions.
|                         Specify name for training file
|   --numerical-literal-name NUM_LITERAL_NAME
|                         Specify name for numerical literal file
|   --valid-file-name VALID_FILE_NAME
I will only leave this comment here, but it applies to every option: add longer, more descriptive help messages.
| collections_raw = defaultdict(list)
|
| if train_edges_raw is not None:
|     for i, row in train_edges_raw.iterrows():
Remove df.iterrows(); it is one of the slowest ways to iterate over a DataFrame.
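To illustrate what I mean, here is a minimal sketch of two alternatives. The loop body is not visible in this hunk, so the grouping of node2 values by label below is only an assumption, and the small DataFrame is a hypothetical stand-in for train_edges_raw.

```python
from collections import defaultdict

import pandas as pd

# Hypothetical stand-in for train_edges_raw; the real columns may differ.
train_edges_raw = pd.DataFrame({
    "node1": ["Q1", "Q2", "Q3"],
    "label": ["P10", "P10", "P20"],
    "node2": [1.5, 2.5, 7.0],
})

collections_raw = defaultdict(list)

if train_edges_raw is not None:
    # itertuples yields lightweight namedtuples instead of building a Series per row.
    for row in train_edges_raw.itertuples(index=False):
        # Assumed loop body: collect node2 values per label.
        collections_raw[row.label].append(row.node2)

# Vectorized alternative for the same (assumed) grouping, with no Python-level row loop.
grouped = train_edges_raw.groupby("label")["node2"].apply(list).to_dict()
```

If the real loop does more than group values, itertuples is usually the smallest change; the groupby form only applies when the body can be expressed as an aggregation.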
| import pandas as pd
| from tqdm import tqdm
| from bisect import bisect
| from kgtk.augment.utils import *
|
| def gen_plabel(pnode, unit=None):
|     if not unit:
|         return pnode + ' (Interval)'
Use f-strings everywhere; be consistent.
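For example, the visible branch could be written as below. This is only a sketch: the rest of gen_plabel is not shown in this hunk, so the label format used when a unit is given is an assumption.

```python
def gen_plabel(pnode, unit=None):
    # Same output as pnode + ' (Interval)', written as an f-string.
    if not unit:
        return f"{pnode} (Interval)"
    # The unit branch is not visible in the diff; this format is an assumption.
    return f"{pnode} (Interval, {unit})"
```

Note that an f-string formats any value, while string concatenation raises a TypeError when pnode is not a str, so check the callers if that difference matters.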
| parser.add_output_file()
|
| parser.add_argument('--dataset', dest='dataset', type=str,
|                     default=None,
Add longer, meaningful help messages.
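Something along these lines would help. This sketch uses plain argparse rather than the KGTK parser, and the help wording is a placeholder: I am assuming --dataset is a directory and that the file names are relative to it, which the author needs to confirm or correct.

```python
import argparse

# Plain argparse stand-in; the program name is hypothetical.
parser = argparse.ArgumentParser(prog="augment-example")

parser.add_argument(
    "--dataset", dest="dataset", type=str, default=None,
    # Placeholder wording: the real meaning of --dataset must come from the author.
    help="Path to the directory that holds the dataset files (the training, "
         "validation, and numerical literal files named by the other options). "
         "Assumed semantics; adjust to the actual behavior.",
)
parser.add_argument(
    "--train-file-name", dest="train_file_name", type=str, default=None,
    # Placeholder wording as well.
    help="Name of the training edge file, relative to --dataset. "
         "Assumed semantics; adjust to the actual behavior.",
)

args = parser.parse_args(["--dataset", "data/", "--train-file-name", "train.tsv"])
```

A good help message says what kind of value is expected (file or directory), what it is relative to, and what happens when the option is omitted.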
| from kgtk.cli_entry import cli_entry
| from kgtk.exceptions import KGTKArgumentParseException
| import glob
For all of the unit tests, check the content of the output as well, not only its length.
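A rough sketch of the kind of check I have in mind follows. The read_edges helper, the test class name, and the sample rows are all hypothetical; in the real tests the rows would come from the file written by the command under test.

```python
import csv
import io
import unittest


def read_edges(handle):
    """Parse a tab-separated KGTK-style edge file into a list of row dicts."""
    return list(csv.DictReader(handle, delimiter="\t"))


class TestAugmentOutput(unittest.TestCase):
    def test_content_not_only_length(self):
        # Stand-in for the file the command under test would write.
        output = io.StringIO(
            "node1\tlabel\tnode2\n"
            "Q1\tP10\t1.5\n"
            "Q2\tP10\t2.5\n"
        )
        rows = read_edges(output)
        expected = [
            {"node1": "Q1", "label": "P10", "node2": "1.5"},
            {"node1": "Q2", "label": "P10", "node2": "2.5"},
        ]
        self.assertEqual(len(rows), 2)    # length check, as in the current tests
        self.assertEqual(rows, expected)  # content check, which is what is missing
```

Comparing against explicit expected rows catches wrong values and wrong column names, which a length check alone lets through.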