Extract miscellaneous information recorded in AirDAS data comments, i.e. comment-data
Usage
airdas_comments_process(x, ...)
# S3 method for class 'data.frame'
airdas_comments_process(x, ...)
# S3 method for class 'airdas_dfr'
airdas_comments_process(x, comment.format = NULL, ...)
# S3 method for class 'airdas_df'
airdas_comments_process(x, comment.format = NULL, ...)
Value
x, filtered for comments with recorded data,
with the following columns added:
comment_str: the full comment string
Misc#: Some number of descriptor columns. There should be n columns, although the minimum number will be two columns
Value: Associated count or percentage for TURTLE/PHOCOENA data
flag_check: logical indicating if the TURTLE/PHOCOENA comment string was longer than an expected number of characters, and thus should be manually inspected
See the additional sections for more context. If comment.format is
NULL, then the output data frame will have two Misc# columns: a level
one descriptor, e.g. "Fish ball" or "Jellyfish", and a level two
descriptor, e.g. s, m, or c. However, if comment.format$n is, say, 4,
then the output data frame will have columns Misc1, Misc2, Misc3, and
Misc4.
Messages are printed if either comment.format is not NULL
and no comment-data is identified using comment.format, or if
x has TURTLE/PHOCOENA data but no TURTLE/PHOCOENA comment-data.
Details
Historically, project-specific or miscellaneous data have been recorded in AirDAS comments using specific formats and character codes. This function identifies and extracts these data from the comment text strings. However, different data types have different comment-data formats. Specifically, TURTLE and PHOCOENA comment-data use identifier codes that each signify a certain data pattern, while other comment-data (usually that of CARETTA) use data separated by some delimiter.
TURTLE and PHOCOENA comment-data
Currently supported data types are: fish balls, molas, jellyfish, and crab
pots. See any of the AirDAS format PDFs (airdas_format_pdf)
for information about the specific codes and formats used to record this
data. All comments are converted to lower case for processing to avoid
missing data.
These different codes contain (at most): a level one descriptor (e.g. fish ball or crab pot), a level two descriptor (e.g. size or jellyfish species), and a value (a count or percentage). Thus, the extracted data are returned together in this structure. The output data frame is long data, i.e. it has one piece of information per line. For instance, if the comment is "fb1s fb1m", then the output data frame will have one line for the small fish ball and one for the medium fish ball. See Value section for more details.
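The long-data structure described above can be illustrated with a minimal base R sketch. It is not the package's implementation: parse_fish_balls is a hypothetical helper, and the code pattern "fb" + count + size letter is an assumption inferred from the example comments in this page.

```r
# Hypothetical helper, not part of swfscAirDAS.
# Assumes fish ball codes look like "fb<count><size>", with size in s/m/l/u.
parse_fish_balls <- function(comment) {
  comment <- tolower(comment)  # comments are lower-cased before processing
  codes <- regmatches(comment, gregexpr("fb[0-9]+[smlu]", comment))[[1]]

  # One output row per code found in the comment (long data)
  data.frame(
    comment_str = rep(comment, length(codes)),
    Misc1 = rep("fish ball", length(codes)),                      # level one descriptor
    Misc2 = sub("fb[0-9]+", "", codes),                           # level two descriptor
    Value = as.numeric(sub("^fb([0-9]+)[smlu]$", "\\1", codes)),  # count
    stringsAsFactors = FALSE
  )
}

fb <- parse_fish_balls("FB2s fb1m")
```

Here fb has two rows, one for the small fish balls and one for the medium fish ball, mirroring the one-row-per-item layout of the function's output.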
Currently this function only recognizes mola data recorded using the "m1", "m2", and "m3" codes (small, medium, and large mola, respectively). Thus, the free-text string "mola" is neither recognized nor processed.
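As a rough illustration of this limitation, the sketch below matches tokens against the three mola codes. The lookup vector and the whitespace tokenization are assumptions made for the example; the package may identify codes differently.

```r
# Assumed mapping, taken from the sentence above: m1/m2/m3 = small/medium/large
mola_sizes <- c(m1 = "s", m2 = "m", m3 = "l")

# Hypothetical comment, lower-cased and split on whitespace (an assumption)
tokens <- strsplit(tolower("M2 mola"), "[[:space:]]+")[[1]]

# Only the coded token is picked up; the free-text "mola" is ignored
recognized <- tokens[tokens %in% names(mola_sizes)]
mola_level2 <- unname(mola_sizes[recognized])
```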
The following codes are used for the level two descriptors:
| Description | Code | 
| Small | s | 
| Medium | m | 
| Large | l | 
| Unknown | u | 
| Chrysaora | c | 
| Moon jelly | m | 
| Egg yolk | e | 
| Other | o | 
Using comment.format
comment.format is a list that allows the user to specify the
comment-data format. To use this argument, data must be separated by a
delimiter. This list must contain three named elements:
n: A single number indicating the number of elements of data in each comment. Must equal the length of type. A comment must contain exactly this number of sep to be recognized as comment-data
sep: A single string indicating the field separator string (delimiter). Values within each comment are separated by this string. Currently accepted values are ";" and ","
type: A character vector of length n indicating the data type of each data element (column). All values must be one of: "character", "numeric", or "integer".
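The constraints on these three elements can be checked before calling the function. The sketch below uses a made-up format (fmt) and made-up field values purely for illustration; coerce_field is a hypothetical helper, and the fields are assumed to have already been split on sep.

```r
# Illustrative format only; not a format used by any real project
fmt <- list(n = 3, sep = ",", type = c("character", "integer", "numeric"))

# Checks taken from the element descriptions above
stopifnot(
  fmt$n == length(fmt$type),
  fmt$sep %in% c(";", ","),
  all(fmt$type %in% c("character", "numeric", "integer"))
)

# Hypothetical helper: coerce one field to its declared type
coerce_field <- function(value, type) {
  switch(type,
    character = as.character(value),
    numeric = as.numeric(value),
    integer = as.integer(value)
  )
}

# Made-up field values, assumed to be already split on fmt$sep
fields <- c("abc", "4", "1.5")
misc <- unname(Map(coerce_field, fields, fmt$type))
names(misc) <- paste0("Misc", seq_len(fmt$n))
```

The result is a named list with one element per declared type, which parallels how comment.format$n determines the number of Misc# columns in the output.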
For instance, for most CARETTA data comment.format should be
list(n = 5, sep = ";", type = c("character", "character", "numeric",
 "numeric", "character"))
Examples
y <- system.file("extdata", "airdas_sample.das", package = "swfscAirDAS")
y.proc <- airdas_process(y)
airdas_comments_process(y.proc)
#>   Event            DateTime      Lat       Lon OnEffort Trans Bft CCover Jelly
#> 1     C 2015-04-09 12:31:48 39.23350 -123.1857     TRUE    T1   1     10     0
#> 2     C 2015-04-09 12:42:01 39.23267 -123.5433     TRUE    T1   1     20     0
#> 3     C 2015-04-09 12:42:01 39.23267 -123.5433     TRUE    T1   1     20     0
#>   HorizSun VertSun HKR  Haze  Kelp RedTide AltFt SpKnot ObsL ObsB ObsR Rec
#> 1        6      NA   n FALSE FALSE   FALSE   650    100   aa   bb   cc  dd
#> 2        6      NA   n FALSE FALSE   FALSE   650    100   aa   bb   cc  dd
#> 3        6      NA   n FALSE FALSE   FALSE   650    100   aa   bb   cc  dd
#>   ObsLR ObsRR VLI VLO VB VRI VRO Data1 Data2 Data3 Data4 Data5 Data6 Data7
#> 1  <NA>  <NA>   g   g  g   g   g  2 cp  <NA>  <NA>  <NA>  <NA>  <NA>  <NA>
#> 2  <NA>  <NA>   g   g  g   g   g  fb2s  fb1m  <NA>  <NA>  <NA>  <NA>  <NA>
#> 3  <NA>  <NA>   g   g  g   g   g  fb2s  fb1m  <NA>  <NA>  <NA>  <NA>  <NA>
#>   EffortDot EventNum          file_das line_num file_type comment_str     Misc1
#> 1      TRUE       11 airdas_sample.das       11    turtle        2 cp  crab pot
#> 2      TRUE       45 airdas_sample.das       46    turtle   fb2s fb1m fish ball
#> 3      TRUE       45 airdas_sample.das       46    turtle   fb2s fb1m fish ball
#>   Misc2 Value flag_check
#> 1  <NA>     2      FALSE
#> 2     s     2      FALSE
#> 3     m     1      FALSE