Ye JOSS Review Comments

Hi, 

This seems like a very handy tool for resolving a tedious aspect of computing fluxes from raw sensor data.

Below are the detailed comments for my [JOSS review](https://github.com/openjournals/joss-reviews/issues/1751):

## Functionality:
* `Load_LGR()`
  - The description needs some more details about the data processing that is occurring - is there cleaning, transformation, etc. that is occurring?
  - the `Time_format` argument only takes a few values. Preferred style for handling this is using `match.arg()`, but in this case, I'd recommend re-using the `orders` argument from `lubridate::parse_date_time()` and passing the argument through.

* `Load_Other`
  - I was a bit confused about the `Date` argument, in that it represents a constant date on which the measurements were made, if such data were not present in the file. Mainly because most of the other arguments refer to columns of the input data file. Maybe you could use some other argument naming scheme to indicate such (and then add a similar additional argument for temperature, instead of reusing `Ta` in multiple ways)
  - same comment re: `Time_format` as for `Load_LGR()`
  
* `FluxCal`
  - In interactive mode, there are no instructions about advancing past the manual selection. I had to look up the help on `identify` to figure out that I needed to click a different mouse button. Unfortunately this didn't work on my laptop, and I had to plug in a mouse. There needs to be instruction for how to proceed, as well as a more accessible option.
  - Depending on how I selected peaks and valleys in interactive mode for the LGR example data, I got an error and the next figure failed to draw:
  ```
  Error in if (Slm$r.squared > R2.CO2[a, 1]) { : 
  missing value where TRUE/FALSE needed
  In addition: Warning message:
  ‘mode(display)’ differs between new and previous
	 ==> NOT changing ‘display’ 
  ```
  - In terms of output format, I wasn't sure what the index and time values referred to? The middle of the range that had the best linear R^2 or the start? I'd suggest using start and end values that bracket the range (this makes it easier to reproduce the linear regressions or tweak if necessary). I think the output would be easier to manipulate if CO2 and CH4 results were not in wide format, but long (since the columns are identical. 
  - For the option to take in timing cues, it'd be more convenient to supply this as a data argument rather than a file, as this enables processing or generating the timing cues in R. Also, in the example, it looks like there is additional metadata (the `Plot` and `Light.Dark` columns) associated with the timing cues that is ignored. It would be great to include this as part of the output, for easier analysis purposes (than having to later match the output with the metadata in the timing file).
  - Is the precision sufficient on the molar mass constants? Maybe the output needs to be rounded to the corresponding significant digits. (Oh, these values are not later used?)
  - There's some duplicated code that could be avoided:
    + calculate ylim manually if not provided, then use the same plotting code
    + calculations for CO2 and CH4 are basically identical
  - The function is long, and subtasks can be handled by separate functions (e.g. generating the initial timing cues, the calculations, and then re-plotting or generating the output). A single function to do the flux calculations could be re-used for both CO2 and CH4, for example.
  - Are there default values for `vol` and `Area` that would work? It's a bit awkward to have them be required arguments, but not at the front of the list of arguments. Maybe a unitless calculation could work?

* The software paper mention that the analysis is "reproducible" -- but if the peaks and valleys are manually chosen, this seems to be non-reproducible. Perhaps you can add an option for selecting peaks and valleys to be turned into a timing cues data structure and output file, to make the calculations reproducible, with the manual interaction only needing to happen once.

* You might try running `goodpractice::gp()` to look for additional issues. Not all of them are relevant for JOSS, but it can identify some general ways to improve the package.

## Code Style:
* the naming style for arguments and functions is a bit inconsistent: e.g. the data-loading functions use Snake_case, while the main computation function is CamelCase. Function arguments sometimes begin with a capital letter, and sometimes do not.
* There are several "magic numbers" - constants used in calculation (frequency as 60 / sampling interval, celsius to kelvin conversion using 273.2, and column indices).

## Error checking:
* Since several of the function arguments refer to column names, some error checking with informative messages would be good to add, when a user makes a typo.

## Documentation:
* roxygen has functionality to create link to other topics (e.g. `\link{\code{}}`). There are a few spots where this would be useful in the docs.
* I agree that a simple vignette showing usage of the functions, perhaps with screenshots for the interactive portion, would help understand the workflow, as well as the input and output data structures.
  
## Software paper:
* editing wise, the only additional note I have to [@jhollist's list](https://github.com/junbinzhao/FluxCalR/issues/2) is:
  Paragraph 1, line 4: "for" -> "because of".
* I'm not sure what about the process of computing fluxes is considered "subjective" and "objective". Isn't the selection of peaks and valleys by hand (or inputting timing cues) also subjective? I think you may want to focus on how your tool makes it easier to select the windows for calculation, and that optimizing for R^2 gets the most consistent outputs.
* To add to [@jhollist's comment](https://github.com/junbinzhao/FluxCalR/issues/2) about the Statement of need, the description contained details that were difficult to distinguish as being relevant or not to the core problem. It would be better to have a simple statement about the problem of selecting measurement windows from the raw data; and then to include additional details about how the package is able to make use of air temperature for computing fluxes. I would save the description of efficiency and benefits of the package for the third paragraph, so that you have a clear description of the functionality first.
* Improved precision in wording is needed in some places. What do you mean by "processing the manually measured fluxes"? From the examples and description, it seems like the sensor is automated, and the final output are flux calculations.
* Is the method of selecting the window with the highest R^2 a commonly accepted method, or compared against other methods? A reference to a methods paper feels warranted here.


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Ye JOSS Review Comments #3

Functionality:

Code Style:

Error checking:

Documentation:

Software paper:

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Ye JOSS Review Comments #3

Description

Functionality:

Code Style:

Error checking:

Documentation:

Software paper:

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions