Create R6 reference object class full_trips
t3::list_t3
-> full_trips
filter_by_years_period()
Function to filter full trips by a period of one or more years.
years_period
Object of class integer expected. Year(s) in 4 digits format.
rf1()
Process of Raising Factor level 1 (RF1 and RF2) calculation.
RF1 aims to correct the bias in the visual weight estimates of catches recorded in logbooks. It corresponds to the ratio of landing weight to catch weight for the species defined by the species_fao_codes_rf1
argument.
Raising Factor 2 (RF2) is not implemented yet but will be computed at this step.
full_trips$rf1(
rf1_computation = TRUE,
apply_rf1_on_bycatch = TRUE,
species_fao_codes_rf1 = c("YFT", "SKJ", "BET", "ALB", "LOT", "MIX", "TUN"),
species_fate_codes_rf1 = as.integer(6),
vessel_type_codes_rf1 = as.integer(c(4, 5, 6)),
rf1_lowest_limit = 0.8,
rf1_highest_limit = 1.2,
global_output_path = NULL
)
rf1_computation
Object of class logical
expected. If FALSE rf1 is not calculated (rf1=1 for all trips).
By default TRUE, the rf1 is calculated for each trip.
apply_rf1_on_bycatch
Object of class logical
expected. By default TRUE, rf1 values will be applied to all the logbook catches associated with the trip, including by-catch species.
If FALSE, only the catch weights of species belonging to the species list defined by the species_fao_codes_rf1
argument are corrected; rf1 is not applied to by-catch species.
species_fao_codes_rf1
Object of type character
expected. Species FAO code(s) used for the RF1 process.
By default, use codes YFT (Thunnus albacares), SKJ (Katsuwonus pelamis), BET (Thunnus obesus), ALB (Thunnus alalunga),
LOT (Thunnus tonggol) and TUN/MIX (mix of tuna species in the Observe/AVDTH database) (French and Mayotte fleets).
species_fate_codes_rf1
Object of type integer
expected. Species fate code(s) used for the RF1 process. By default 6 ("Retained, presumably destined for the cannery").
vessel_type_codes_rf1
Object of type integer
expected. By default 4, 5 and 6. Vessel type(s).
rf1_lowest_limit
Object of type numeric
expected. Verification value for the lowest limit of the RF1. By default 0.8.
rf1_highest_limit
Object of type numeric
expected. Verification value for the highest limit of the RF1. By default 1.2.
global_output_path
By default object of type NULL
but object of type character
. Path of the global outputs directory. The function will create subdirectories if necessary.
By default NULL, for no outputs extraction. Outputs will be extracted only if a global_output_path is specified.
If a global_output_path is specified, the following outputs are extracted and saved in ".csv" format under the path: "global_output_path/level1/data/".
process_1_1_detail: a table (.csv) with as many rows as elementary catches and 23 columns:
full_trip_id: retained full trip id, type integer
.
full_trip_name: full trip id, type integer
.
trip_id: trip identification (unique topiaid from database), type character
.
activity_id: activity identification (unique topiaid from database), type character
.
activity_latitude: activity latitude, type numeric
.
activity_longitude: activity longitude, type numeric
.
trip_end_date: trip end date (y-m-d format), type character
.
year_trip_end_date: year of trip end, type integer
.
vessel_code: vessel code, type integer
.
vessel_type_code: vessel type code, type integer
.
rf1: raising factor to correct the bias in the visual weight estimates of catches recorded in logbooks.
rf1 is the ratio of landing weight to catch weight for the species defined by the species_fao_codes_rf1
argument, type numeric
.
statut_rf1: status rf1, type character
.
rf2: raising factor to correct for missing logbook(s); not implemented yet (rf2=1), type numeric
.
statut_rf2: status rf2, type character
.
species_fao_code: species FAO code, type character
.
elementarycatch_id: elementary catch identification (unique topiaid from database), type character
.
species_fate_code: species fate codes, type integer
.
For example, in the Observe database:
4 : discarded alive.
5 : discarded dead.
6 : Retained, presumably destined for the cannery
8 : used for crew consumption on board.
11 : discarded status unknown (only for EMS and logbook).
15 : retained for local market or dried/salted fish on board.
landing_weight: landing weight (without local market), in tonnes, type numeric
.
catch_weight: catch weight (visual estimation), in tonnes, type numeric
.
catch_count: catch count, type integer
.
catch_weight_rf2: catch weight after visual estimation correction, in tonnes: catch_weight_rf2=catch_weight x rf1 (x rf2)
(Process 1.1: Raising Factors level 1), type numeric
.
statut_rf1_label: status rf1 label, type character
.
statut_rf2_label: status rf2 label, type character
.
process_1_1_global: a table (.csv) with as many rows as full trips and 17 columns:
full_trip_id: retained full trip id, type integer
.
full_trip_name: full trip id, type integer
.
trip_id: trip identification (unique topiaid from database), type character
.
trip_end_date: trip end date (y-m-d format), type character
.
year_trip_end_date: year of trip end, type integer
.
vessel_code: vessel code, type integer
.
vessel_type_code: vessel type code, type integer
.
rf1: raising factor to correct the bias in the visual weight estimates of catches recorded in logbooks.
rf1 is the ratio of landing weight to catch weight for the species defined by the species_fao_codes_rf1
argument, type numeric
.
statut_rf1: status rf1, type character
.
rf2: raising factor to correct for missing logbook(s); not implemented yet (rf2=1), type numeric
.
statut_rf2: status rf2, type character
.
landing_weight: landing weight (without local market), in tonnes, type numeric
.
catch_weight: catch weight (visual estimation), in tonnes, type numeric
.
catch_count: catch count, type integer
.
catch_weight_rf2: catch weight after visual estimation correction (tonnes): catch_weight_rf2=catch_weight x rf1 (x rf2)
(full_trips$rf1()
), type numeric
.
statut_rf1_label: status rf1 label, type character
.
statut_rf2_label: status rf2 label, type character
.
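The RF1 ratio and the catch_weight_rf2 column described above can be illustrated with a minimal base R sketch. All values are hypothetical, the name catch_weight_rf2_species_only is introduced only for illustration, and the limits are assumed here to simply flag a trip; this is not the package's implementation.

```r
# Hypothetical logbook catches for one trip (tonnes); values are illustrative only.
catches <- data.frame(
  species_fao_code = c("YFT", "SKJ", "BET", "FRI"),
  catch_weight = c(40, 50, 10, 5),
  stringsAsFactors = FALSE
)
landing_weight <- 90  # hypothetical landed weight of the RF1 species (tonnes)
species_fao_codes_rf1 <- c("YFT", "SKJ", "BET", "ALB", "LOT", "MIX", "TUN")

# RF1: ratio of landing weight to logbook catch weight, for the RF1 species only.
is_rf1_species <- catches$species_fao_code %in% species_fao_codes_rf1
rf1 <- landing_weight / sum(catches$catch_weight[is_rf1_species])

# Assumption: the limits only serve to flag unusual values.
rf1_in_limits <- rf1 >= 0.8 && rf1 <= 1.2

# RF2 is not implemented yet, so it is fixed to 1.
rf2 <- 1

# apply_rf1_on_bycatch = TRUE: every catch of the trip is corrected.
catches$catch_weight_rf2 <- catches$catch_weight * rf1 * rf2

# apply_rf1_on_bycatch = FALSE: by-catch species keep their logbook weight.
catches$catch_weight_rf2_species_only <- ifelse(
  is_rf1_species, catches$catch_weight * rf1 * rf2, catches$catch_weight
)
```

In this sketch FRI is the only by-catch species, so it is the only row that differs between the two corrected columns.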
conversion_weight_category()
Process of logbook weight categories conversion.
Logbook weight categories differ from one tuna fishing company to another; they overlap and are hard to use directly from the logbook.
This process aims to homogenize these weight categories and create simplified categories defined according to the fishing school type and the ocean:
< 10kg and > 10kg for the floating object school in the Atlantic and Indian Ocean,
< 10kg, 10-30kg and > 30kg for undetermined and free school in the Atlantic Ocean,
< 10kg and > 10kg for undetermined and free school in the Indian Ocean.
For each layer (ocean / fishing school / species / logbook weight category), a distribution key is applied for conversion to standardized weight categories. Details of the distribution key are available in the vignette Process 1.2: logbook weight categories conversion.
full_trips$conversion_weight_category(
global_output_path = NULL,
referential_template = "observe"
)
global_output_path
By default object of type NULL
but object of type character
. Path of the global outputs directory. The function will create subdirectories if necessary.
By default NULL, for no outputs extraction. Outputs will be extracted only if a global_output_path is specified.
referential_template
Object of class character
expected. By default "observe". Referential template selected (for example regarding the activity_code). You can switch to "avdth".
If a global_output_path is specified, the following output is extracted and saved in ".csv" format under the path: "global_output_path/level1/data/".
process_1_2: a table (.csv) with as many rows as elementary catches, plus the catches resulting from the conversion of weight categories and 23 columns:
full_trip_id: retained full trip id, type integer
.
full_trip_name: full trip id, type integer
.
trip_id: trip identification (unique topiaid from database), type character
.
trip_end_date: trip end date, type character
.
year_trip_end_date: year of trip end, type integer
.
vessel_code: vessel code, type integer
.
vessel_type_code: vessel type code, type integer
.
activity_id: activity identification (unique topiaid from database), type character
.
activity_latitude: activity latitude, type numeric
.
activity_longitude: activity longitude, type numeric
.
activity_date: activity date, type POSIXct
.
ocean_code: ocean code, type integer
.
For example ocean_code=1
for the Atlantic Ocean and ocean_code=2
for the Indian Ocean.
school_type_code: school type code, type integer
.
In Observe referential template: 1 for floating object school, 2 for free school and 0 for undetermined school.
elementarycatch_id: elementary catch identification (unique topiaid from database), type character
.
species_fao_code: species FAO code, type character
.
weight_category_code: weight category code defined in logbooks, type character
.
weight_category_min: weight category's lower limit (kg), type numeric
.
weight_category_max: weight category's upper limit (kg), type numeric
.
weight_category_label: weight category label defined in logbooks, type character
.
catch_weight_rf2: catch weight after visual estimation correction (tonnes): catch_weight_rf2=catch_weight x rf1 (x rf2)
(Process 1.1: Raising Factors level 1), type numeric
.
weight_category_code_corrected: weight category after conversion, type character
.
catch_weigh_category_code_corrected: catch weight after weight category conversion (tonnes), type numeric
.
Indeed, the catch weight corresponding to a logbook weight category can be split between several corrected weight categories, according to the distribution key applied for conversion to standardized weight categories.
catch_count: catch count, type integer
.
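As a rough illustration of how a distribution key splits a logbook catch between standardized weight categories, here is a base R sketch for a single ocean / school type / species layer. The category codes "A" and "B", the proportions and the column names proportion and catch_weight_corrected are hypothetical; the real key is the one described in the vignette.

```r
# Hypothetical distribution key for one ocean / school type / species layer.
# Proportions are illustrative only and sum to 1 per logbook category.
distribution_key <- data.frame(
  weight_category_code = c("A", "A", "B"),
  weight_category_code_corrected = c("<10kg", ">10kg", ">10kg"),
  proportion = c(0.5, 0.5, 1),
  stringsAsFactors = FALSE
)

# Hypothetical elementary catches after RF1 correction (tonnes).
catches <- data.frame(
  elementarycatch_id = c("c1", "c2"),
  weight_category_code = c("A", "B"),
  catch_weight_rf2 = c(12, 8),
  stringsAsFactors = FALSE
)

# One output row per (catch, corrected category): catch c1 is split in two rows.
converted <- merge(catches, distribution_key, by = "weight_category_code")
converted$catch_weight_corrected <- converted$catch_weight_rf2 * converted$proportion

# Total weight per standardized category; the overall total is preserved.
totals <- tapply(
  converted$catch_weight_corrected,
  converted$weight_category_code_corrected,
  sum
)
```

This is why the output table can have more rows than there are elementary catches.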
set_count()
Process for counting positive sets.
global_output_path
By default object of type NULL
but object of type character
. Path of the global outputs directory. The function will create subdirectories if necessary.
By default NULL, for no outputs extraction. Outputs will be extracted only if a global_output_path is specified.
referential_template
Object of class character
expected. By default "observe". Referential template selected (for example regarding the activity_code). You can switch to "avdth".
If a global_output_path is specified, the following output is extracted and saved in ".csv" format under the path: "global_output_path/level1/data/".
process_1_3: a table (.csv) with as many rows as activities and 15 columns:
full_trip_id: retained full trip id, type integer
.
full_trip_name: full trip id, type integer
.
trip_id: trip identification (unique topiaid from database), type character
.
trip_end_date: trip end date, type character
.
year_trip_end_date: year of trip end, type integer
.
vessel_code: vessel code, type integer
.
vessel_type_code: vessel type code, type integer
.
activity_id: activity identification (unique topiaid from database), type character
.
activity_latitude: activity latitude, type numeric
.
activity_longitude: activity longitude, type numeric
.
activity_date: activity date, type POSIXct
.
activity_code: activity code to define the type of activity, type integer
.
ocean_code: ocean code, type integer
.
For example ocean_code=1
for the Atlantic Ocean and ocean_code=2
for the Indian Ocean.
school_type_code: school type code, type integer
.
In Observe referential template: 1 for floating object school, 2 for free school and 0 for undetermined school.
positive_set_count: count of positive sets (catch weight and/or catch count not zero), type integer
.
fishing_effort()
Process for calculating set duration, time at sea, fishing time and searching time (in hours). Details about the methods are available in the vignette Process 1.4: Fishing effort indicators calculation.
full_trips$fishing_effort(
set_duration_ref,
activity_code_ref,
sunrise_schema = "sunrise",
sunset_schema = "sunset",
global_output_path = NULL,
referential_template = "observe"
)
set_duration_ref
Object of type data.frame
or tbl_df
expected.
Data and parameters for set duration calculation (by year, country, ocean and school type), in the same format as the referential set duration table.
Durations are given in minutes in the reference table and are converted into hours in the output for subsequent processing.
activity_code_ref
Object of type data.frame
or tbl_df
expected.
Reference table with the activity codes to be taken into account for the allocation of sea and/or fishing time,
and/or searching time and/or set duration.
sunrise_schema
Object of class character expected. Sunrise characteristic. By default "sunrise" (top edge of the sun appears on the horizon). See below for more details.
sunset_schema
Object of class character expected. Sunset characteristic. By default "sunset" (sun disappears below the horizon, evening civil twilight starts). See below for more details.
global_output_path
By default object of type NULL
but object of type character
expected if outputs are to be extracted.
Path of the global outputs directory. The function will create subdirectories if necessary.
By default NULL, for no outputs extraction. Outputs will be extracted only if a global_output_path is specified.
referential_template
Object of class character
expected. By default "observe".
Referential template selected (for example regarding the activity_code). You can switch to "avdth".
Available values for sunrise_schema and sunset_schema are:
"sunrise": sunrise (top edge of the sun appears on the horizon)
"sunriseEnd": sunrise ends (bottom edge of the sun touches the horizon)
"goldenHourEnd": morning golden hour ends (soft light, best time for photography)
"solarNoon": solar noon (sun is in the highest position)
"goldenHour": evening golden hour starts
"sunsetStart": sunset starts (bottom edge of the sun touches the horizon)
"sunset": sunset (sun disappears below the horizon, evening civil twilight starts)
"dusk": dusk (evening nautical twilight starts)
"nauticalDusk": nautical dusk (evening astronomical twilight starts)
"night": night starts (dark enough for astronomical observations)
"nadir": nadir (darkest moment of the night, sun is in the lowest position)
"nightEnd": night ends (morning astronomical twilight starts)
"nauticalDawn": nautical dawn (morning nautical twilight starts)
"dawn": dawn (morning nautical twilight ends, morning civil twilight starts)
If a global_output_path is specified, the following output is extracted and saved in ".csv" format under the path: "global_output_path/level1/data/".
process_1_4: a table (.csv) with as many rows as activities and 20 columns:
full_trip_id: retained full trip id, type integer
.
full_trip_name: full trip id, type integer
.
trip_id: trip identification (unique topiaid from database), type character
.
trip_end_date: trip end date, type character
.
year_trip_end_date: year of trip end, type integer
.
vessel_code: vessel code, type integer
.
vessel_type_code: vessel type code, type integer
.
activity_id: activity identification (unique topiaid from database), type character
.
activity_latitude: activity latitude, type numeric
.
activity_longitude: activity longitude, type numeric
.
activity_date: activity date, type POSIXct
.
activity_code: activity code to define the type of activity, type integer
.
objectoperation_code: object operation code to define the type of floating object operation (in Observe referential), type character
.
ocean_code: ocean code, type integer
.
For example ocean_code=1
for the Atlantic Ocean and ocean_code=2
for the Indian Ocean.
school_type_code: school type code, type integer
.
In Observe referential template: 1 for floating object school, 2 for free school and 0 for undetermined school.
positive_set_count: count of positive sets (catch weight and/or catch count not zero), type integer
.
set_duration: set duration in hours, according to the referential set duration table, type numeric
.
time_at_sea: time at sea in hours, type numeric
.
fishing_time: fishing time in hours, type numeric
.
searching_time: searching time in hours, type numeric
.
Equal to the fishing time value minus the sum of the set duration values.
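The relation between searching time, fishing time and set durations can be sketched in base R as follows. The values and the column name set_duration_minutes are hypothetical, and applying the subtraction at the scale of one fishing day is an assumption made for the example.

```r
# Hypothetical activities of one fishing day; set durations as they would be
# stored in a reference table, in minutes.
activities <- data.frame(
  activity_id = c("a1", "a2", "a3"),
  set_duration_minutes = c(150, 0, 180),
  stringsAsFactors = FALSE
)

# Set durations are converted from minutes to hours for the output.
activities$set_duration <- activities$set_duration_minutes / 60

# Hypothetical fishing time for the day (hours).
fishing_time <- 12

# Searching time: fishing time minus the sum of the set durations.
searching_time <- fishing_time - sum(activities$set_duration)
```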
sample_length_class_ld1_to_lf()
Process for length conversion, if necessary, to fork length (LF).
During the sampling process, fish lengths can be collected and expressed in different standards.
For example, depending on field constraints and, more precisely, on the size of the different species, sampling data covered in T3 can be expressed in first dorsal length (LD1) or curved fork length (LF).
Generally, lengths of small individuals are provided in LF because it is logistically possible and easier to measure the entire fish, while lengths of bigger individuals are provided in LD1, for the converse reasons.
This step aims to harmonize the length standard across sampling data so that, at the end, all length sampling data are expressed in LF.
Historically and so far, the process uses a referential conversion table LD1 to LF.
In addition, the sample_number_measured
variable will be converted in this step to a sample_number_measured_lf
variable (notably because new samples are created to split one LD1 class into multiple LF classes during certain conversions).
full_trips$sample_length_class_ld1_to_lf(
length_step,
global_output_path = NULL,
referential_template = "observe"
)
length_step
Object of type data.frame
or tbl_df
expected.
Data frame object with length ratio between ld1 and lf class, in the same format as the conversion table LD1 to LF.
global_output_path
By default object of type NULL
but object of type character
. Path of the global outputs directory. The function will create subdirectories if necessary.
By default NULL, for no outputs extraction. Outputs will be extracted only if a global_output_path is specified.
referential_template
Object of class character
expected. By default "observe". Referential template selected (for example regarding the activity_code). You can switch to "avdth".
If a global_output_path is specified, the following output is extracted and saved in ".csv" format under the path: "global_output_path/level2/data/".
process_2_1: a table (.csv) with as many rows as elementary samples raw, plus the elementary samples raw created by certain conversions from LD1 to LF classes, and 16 columns:
full_trip_id: retained full trip id, type integer
.
full_trip_name: full trip id, type integer
.
trip_id: trip identification (unique topiaid from database), type character
.
trip_end_date: trip end date, type character
.
year_trip_end_date: year of trip end, type integer
.
vessel_code: vessel code, type integer
.
vessel_type_code: vessel type code, type integer
.
well_id: well identification (unique topiaid from database (ps_logbook.well in Observe)), type character
.
sample_id: sample identification (unique topiaid from database (ps_logbook.sample in Observe)), type character
.
sub_sample_id: sub-sample identification number, type integer
.
elementarysampleraw_id: elementarysampleraw identification (unique topiaid from database (ps_logbook.samplespeciesmeasure in Observe)), type character
.
species_fao_code: species FAO code, type character
.
sample_length_class: sample length class (cm) of measured individuals in first dorsal length (LD1), type numeric
.
sample_number_measured: sample number of measured individuals in first dorsal length (LD1), type integer
.
sample_length_class_lf: sample length class (cm) of measured individuals converted in curved fork length (LF), type numeric
.
sample_number_measured_lf: sample number of measured individuals converted for curved fork length (LF) distribution, type numeric
.
For example, for one sample (sample_number_measured=1
) from the Atlantic Ocean (1), of the species YFT (Thunnus albacares), with a first dorsal length class (LD1) measured at sample_length_class=8
(cm),
the LD1 to LF conversion will create a new elementary sample row because ratio=50
% of the number of fish in the sample will be assigned to the curved fork length class sample_length_class_lf=32
(cm) (with sample_number_measured_lf=0.5
)
and the other 50% of this sample will be assigned to sample_length_class_lf=34
(cm) (with sample_number_measured_lf=0.5
).
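The example above can be reproduced with a small base R sketch. The layout of the conversion table and its column names (ld1_class, lf_class, ratio) are assumptions made for illustration and may differ from the actual conversion table LD1 to LF.

```r
# Hypothetical extract of an LD1 to LF conversion table (column names assumed).
length_step <- data.frame(
  ocean_code = c(1L, 1L),
  species_fao_code = c("YFT", "YFT"),
  ld1_class = c(8, 8),
  lf_class = c(32, 34),
  ratio = c(50, 50),  # percentage of the LD1 class assigned to each LF class
  stringsAsFactors = FALSE
)

# One fish measured in LD1 class 8 cm, Atlantic Ocean, YFT.
sample_ld1 <- data.frame(
  ocean_code = 1L,
  species_fao_code = "YFT",
  sample_length_class = 8,
  sample_number_measured = 1L,
  stringsAsFactors = FALSE
)

# Joining on ocean, species and LD1 class yields one row per target LF class.
sample_lf <- merge(
  sample_ld1, length_step,
  by.x = c("ocean_code", "species_fao_code", "sample_length_class"),
  by.y = c("ocean_code", "species_fao_code", "ld1_class")
)
sample_lf$sample_length_class_lf <- sample_lf$lf_class
sample_lf$sample_number_measured_lf <-
  sample_lf$sample_number_measured * sample_lf$ratio / 100
```

The single LD1 row becomes two LF rows, and the number of measured fish is preserved in total (0.5 + 0.5 = 1), which is why sample_number_measured_lf is numeric rather than integer.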
sample_number_measured_extrapolation()
Process for extrapolating the number of measured individuals in a sample to the number of counted individuals. During sampling, and according to the protocol, only a part of the counted individuals are measured. The aim of this step is to extrapolate the number of individuals measured in the sample to the number of individuals counted in the sample. To do so, a Raising Factor (RF4) is calculated per stratum, per well, per sample, per sub-sample and per species. It is equal, by stratum, to the sum of counted individuals divided by the sum of measured individuals (after conversion of the measurements to curved fork length in process 2.1).
If a global_output_path is specified, the following output is extracted and saved in ".csv" format under the path: "global_output_path/level2/data/".
process_2_2: a table (.csv) with as many rows as elementary samples raw, plus the elementary samples raw created by certain conversions from LD1 to LF classes, and 17 columns:
full_trip_id: retained full trip id, type integer
.
full_trip_name: full trip id, type integer
.
trip_id: trip identification (unique topiaid from database), type character
.
trip_end_date: trip end date, type character
.
year_trip_end_date: year of trip end, type integer
.
vessel_code: vessel code, type integer
.
vessel_type_code: vessel type code, type integer
.
well_id: well identification (unique topiaid from database (ps_logbook.well in Observe)), type character
.
sample_id: sample identification (unique topiaid from database (ps_logbook.sample in Observe)), type character
.
sub_sample_id: sub-sample identification number, type integer
.
sub_sample_total_count_id: sub sample identification bis in relation with the fish total count (unique topiaid from database (ps_logbook.samplespecies in Observe)), type character
.
elementarysampleraw_id: elementarysampleraw identification (unique topiaid from database (ps_logbook.samplespeciesmeasure in Observe)), type character
.
species_fao_code: species FAO code, type character
.
sample_length_class_lf: sample length class (cm) of measured individuals converted in curved fork length (LF), type numeric
.
sample_number_measured_lf: sample number of measured individuals converted for curved fork length (LF) distribution, type numeric
.
sample_total_count: total number of individuals counted for this sample, type integer
.
sample_number_measured_extrapolated_lf: sample number of measured individuals (converted in LF) extrapolated to the sample number of counted individuals, type numeric
. sample_number_measured_extrapolated_lf = sample_number_measured_lf x rf4
.
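The RF4 extrapolation can be sketched in base R for a single stratum (one well, sample, sub-sample and species). The counts are hypothetical and only illustrate the formula given above.

```r
# Hypothetical stratum: measured individuals per LF length class.
stratum <- data.frame(
  sample_length_class_lf = c(40, 42, 44),
  sample_number_measured_lf = c(10, 25, 15)
)

# Hypothetical number of individuals counted for this stratum.
sample_total_count <- 200L

# RF4: counted individuals divided by measured individuals.
rf4 <- sample_total_count / sum(stratum$sample_number_measured_lf)

# Each length class is raised by the same factor.
stratum$sample_number_measured_extrapolated_lf <-
  stratum$sample_number_measured_lf * rf4
```

After extrapolation, the length classes sum to the counted total while the shape of the length distribution is unchanged.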
sample_length_class_step_standardisation()
Process for step standardisation of LF length classes. This step aims to standardize sample length classes. So far, these specifications are integrated in the process:
a length class step of 1 cm for: SKJ (Katsuwonus pelamis), LTA (Euthynnus alletteratus) and FRI (Auxis thazard),
a length class step of 2 cm for: YFT (Thunnus albacares), BET (Thunnus obesus) and ALB (Thunnus alalunga).
To standardize the original sample's curved fork length (LF), the object "elementarysample" is created by aggregation of elementary sample raw.
full_trips$sample_length_class_step_standardisation(
maximum_lf_class = as.integer(500),
global_output_path = NULL
)
maximum_lf_class
Object of type integer
expected. Theoretical maximum LF class that can occur (all species considered). By default 500.
global_output_path
By default object of type NULL
but object of type character
. Path of the global outputs directory. The function will create subdirectories if necessary.
By default NULL, for no outputs extraction. Outputs will be extracted only if a global_output_path is specified.
If a global_output_path is specified, the following output is extracted and saved in ".csv" format under the path: "global_output_path/level2/data/".
process_2_3: a table (.csv) with as many rows as elementary samples, and 17 columns:
full_trip_id: retained full trip id, type integer
.
full_trip_name: full trip id, type integer
.
trip_id: trip identification (unique topiaid from database), type character
.
trip_end_date: trip end date, type character
.
year_trip_end_date: year of trip end, type integer
.
vessel_code: vessel code, type integer
.
vessel_type_code: vessel type code, type integer
.
well_id: well identification (unique topiaid from database (ps_logbook.well in Observe)), type character
.
sample_id: sample identification (unique topiaid from database (ps_logbook.sample in Observe)), type character
.
sample_type_code: sample type code, type integer
.
sample_quality_code: sample quality code, type integer
.
sub_sample_id: sub-sample identification number, type integer
.
species_fao_code: species FAO code, type character
.
sample_total_count: total number of individuals counted for this sample, type integer
.
sample_standardised_length_class_lf: standardised sample length class (cm) in curved fork length (LF), according to the species and step associated, type numeric
.
sample_number_measured_extrapolated_lf: standardised sample number of measured individuals (converted in LF and extrapolated in step 2.2), type numeric
.
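A possible way to picture the step standardisation and the aggregation into elementary samples is the base R sketch below. It assumes that each standardised class is labelled by its lower bound, which is an assumption of this example and not a documented rule; the data and the names length_step_by_species and elementary_sample are hypothetical.

```r
# Length class step per species, as listed in the description (cm).
length_step_by_species <- c(SKJ = 1, LTA = 1, FRI = 1, YFT = 2, BET = 2, ALB = 2)

# Hypothetical elementary samples raw after process 2.2.
samples <- data.frame(
  species_fao_code = c("YFT", "YFT", "YFT", "SKJ"),
  sample_length_class_lf = c(40, 41, 42, 45),
  sample_number_measured_extrapolated_lf = c(3, 2, 5, 4),
  stringsAsFactors = FALSE
)

# Step to apply to each row, looked up by species code.
step <- unname(length_step_by_species[as.character(samples$species_fao_code)])

# Assumption: a standardised class is labelled by its lower bound.
samples$sample_standardised_length_class_lf <-
  floor(samples$sample_length_class_lf / step) * step

# Aggregate the raw rows falling in the same species / standardised class.
elementary_sample <- aggregate(
  sample_number_measured_extrapolated_lf ~
    species_fao_code + sample_standardised_length_class_lf,
  data = samples,
  FUN = sum
)
```

Here the YFT rows at 40 cm and 41 cm fall in the same 2 cm class and are summed, while SKJ keeps its 1 cm class.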
well_set_weight_categories()
Process for well set weight categories definition.
The sampling provides information at the well scale. However, a set can be split between several wells and the individuals sampled could belong to as many sets as there are in the well.
This process aims to compute a weighted weight, which represents the weight of a set in a well, according to the distribution of this set in all the wells.
The overall formula is as follows:
\(WW = \frac{W1}{W2} \times WT\), where:
WW: is the weighted weight,
W1: is the weight of the set in the well,
W2: is the weight of the set in all the sampled wells,
WT: is the total weight of the set.
So far, the process is developed for purse seiners.
Furthermore, the proportion of each sampled set within the sampled well is calculated from the weighted weight:
\(PWW_{i,j} = \frac{WW_{i,j}}{\sum_{k=1}^{n} WW_{k,j}}\), where:
\(PWW_{i,j}\): is the proportional weighted weight of set i in well j,
\(WW_{i,j}\): is the weighted weight of the current set i in well j,
n: is the number of sampled sets in well j.
full_trips$well_set_weight_categories(
sample_set,
global_output_path = NULL,
referential_template = "observe"
)
sample_set
Object of type data.frame
expected. Data frame object with the weighted weight of each sampled set.
global_output_path
By default object of type NULL
but object of type character
. Path of the global outputs directory. The function will create subdirectories if necessary.
By default NULL, for no outputs extraction. Outputs will be extracted only if a global_output_path is specified.
referential_template
Object of class character
expected. By default "observe". Referential template selected (for example regarding the activity_code). You can switch to "avdth".
If a global_output_path is specified, the following output is extracted and saved in ".csv" format under the path: "global_output_path/level2/data/".
process_2_4: a table (.csv) with as many rows as elementary samples, and 12 columns:
full_trip_id: retained full trip id, type integer
.
full_trip_name: full trip id, type integer
.
trip_id: trip identification (unique topiaid from database), type character
.
trip_end_date: trip end date, type character
.
year_trip_end_date: year of trip end, type integer
.
vessel_code: vessel code, type integer
.
vessel_type_code: vessel type code, type integer
.
well_id: well identification (unique topiaid from database (ps_logbook.well in Observe)), type character
.
activity_id: activity identification (unique topiaid from database (ps_logbook.activity in Observe)), type character
.
weighted_weight_minus10: weighted catch weight of individuals in the less than 10 kg category (by well, in tonnes, considering all species), type numeric
.
weighted_weight_plus10: weighted catch weight of individuals in the over 10 kg category (by well, in tonnes, considering all species), type numeric
.
weighted_weight: weighted catch weight (WW) of individuals (both the less than and more than 10 kg categories, by well, in tonnes, considering all species), which represents the weight of a set in a well, type numeric
.
To better understand what the process does, let's look at an example:
A set of 90 tonnes is distributed across 3 wells: 40 tonnes in the first one, 30 tonnes in the second and 20 tonnes in the last one.
Wells 2 and 3 were sampled but not the first one. For the second well, the weighted weight will be equal to 54 tonnes (30 / 50 x 90).
For the third one, the weighted weight will be equal to 36 tonnes (20 / 50 x 90).
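The worked example above can be checked with a short base R sketch. The first part uses the figures of the example; the second part, which illustrates the proportional weighted weight (PWW), adds a hypothetical second set "set_B" in well 2, with an invented weighted weight of 18 tonnes.

```r
# The 90 tonne set of the example, split across three wells (tonnes).
set_in_wells <- data.frame(
  well_id = c("well_1", "well_2", "well_3"),
  weight_in_well = c(40, 30, 20),  # W1
  sampled = c(FALSE, TRUE, TRUE),
  stringsAsFactors = FALSE
)

wt <- sum(set_in_wells$weight_in_well)                        # WT: total set weight
w2 <- sum(set_in_wells$weight_in_well[set_in_wells$sampled])  # W2: weight in sampled wells

# Weighted weight WW = W1 / W2 x WT, for the sampled wells only.
sampled_wells <- set_in_wells[set_in_wells$sampled, ]
sampled_wells$weighted_weight <- sampled_wells$weight_in_well / w2 * wt

# PWW inside well 2, assuming a hypothetical second set with WW = 18 tonnes.
well_2_sets <- data.frame(
  set_id = c("set_A", "set_B"),
  weighted_weight = c(54, 18),
  stringsAsFactors = FALSE
)
well_2_sets$pww <- well_2_sets$weighted_weight / sum(well_2_sets$weighted_weight)
```

The weighted weights of the sampled wells (54 and 36 tonnes) add up to the total set weight of 90 tonnes, and the PWW values within a well add up to 1.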
standardised_sample_creation()
Creation of the standardised sample object. This process aims to sum up the samples according to the updates made to sample data in processes 2.1 to 2.3. In this step, the notion of sub-sample is left behind and the new samples created in the steps above (for example during the conversion from LD1 to LF) are taken into account. This step creates a new object called standardized sample, expressed at the scale of the trip, the well, the sample (id, quality and type) and the species.
If a global_output_path is specified, the following output is extracted and saved in ".csv" format under the path: "global_output_path/level2/data/".
process_2_5: a table (.csv) with as many rows as standardized samples, and 12 columns:
full_trip_id: retained full trip id, type integer
.
full_trip_name: full trip id, type integer
.
trip_id: trip identification (unique topiaid from database), type character
.
trip_end_date: trip end date, type character
.
year_trip_end_date: year of trip end, type integer
.
vessel_code: vessel code, type integer
.
vessel_type_code: vessel type code, type integer
.
well_id: well identification (unique topiaid from database (ps_logbook.well in Observe)), type character
.
sample_id: sample identification (unique topiaid from database (ps_logbook.sample in Observe)), type character
.
species_fao_code: species FAO code, type character
.
sample_standardised_length_class_lf: standardized sample length class in curved fork length (LF) (cm), type numeric
.
sample_number_measured_extrapolated_lf: standardized sample number of measured individuals (converted in LF and extrapolated to all counted individuals), type numeric
.
standardised_sample_set_creation()
Creation of the R6 standardised sample set object.
In the previous processes and in the associated standardized sample object, samples are expressed at the well scale.
In this step, the aim is to move from samples expressed by well to samples expressed by set.
In process 2.4, a weighted weight (WW) and a proportion of this weighted weight (PWW) were calculated at the set scale.
By combining these values with elements of the standardized sample object, a new object called standardized sample set is created. As explained before, this object is the expression of the sample at the set scale.
Furthermore, this process converts the sample length measurements to weights using length weight relationships (LWR).
LWR formulas take the form: \(RWT=a \times LF^b\), where:
RWT: is the round weight (kg),
LF: is the curved fork length (cm),
parameters a and b come from a reference table such as the Referential LWR table and depend on the species and potentially on the area (ocean or others) and the season.
More detailed information can be found from the regional fisheries management organisations (RFMOs) such as ICCAT or IOTC.
full_trips$standardised_sample_set_creation(
length_weight_relationship_data,
global_output_path = NULL
)
length_weight_relationship_data
Object of type data.frame or tbl_df expected.
Data frame object with parameters for length weight relationships, in the same format as the Referential LWR table.
global_output_path
By default object of type NULL, otherwise object of type character expected.
Path of the global outputs directory. The function will create subdirectories if necessary.
By default NULL, for no outputs extraction. Outputs are extracted only if a global_output_path is specified.
If a global_output_path is specified, the following output is extracted and saved in ".csv" format under the path: "global_output_path/level2/data/".
process_2_6: a table (.csv) with as many rows as , and 15 columns:
full_trip_id: retained full trip id, type integer.
full_trip_name: full trip id, type integer.
trip_id: trip identification (unique topiaid from database), type character.
trip_end_date: trip end date, type character.
year_trip_end_date: year of trip end, type integer.
vessel_code: vessel code, type integer.
vessel_type_code: vessel type code, type integer.
well_id: well identification (unique topiaid from database (ps_logbook.well in Observe)), type character.
sample_id: sample identification (unique topiaid from database (ps_logbook.sample in Observe)), type character.
species_fao_code: species FAO code, type character.
sample_standardised_length_class_lf: standardized sample length class in curved fork length (LF) (cm), type numeric.
sample_number_weighted: sample number of measured individuals weighted by set weight, after conversion in LF and extrapolation to all counted individuals, type numeric. sample_number_weighted = sample_number_measured_extrapolated_lf * PWW.
sample_weight_unit: weight (kg) of one individual, calculated using length weight relationships as in the Referential LWR table (sample_weight_unit = parameter_a * sample_standardised_length_class_lf ^ parameter_b), type numeric.
sample_weight: weight (kg) of all measured individuals weighted by set weight, after conversion in LF and extrapolation to all counted individuals, type numeric. sample_weight = sample_weight_unit * sample_number_weighted.
sample_category: sample category ("-10kg" or "+10kg"), according to the sample_weight_unit value, type character.
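The derived columns of process_2_6 can be sketched on a toy table as follows. The column name pww (standing for the PWW from process 2.4) and the parameter values are assumptions made for illustration; the 10 kg boundary follows the \(\leq\) 10kg / > 10kg categories used in the next step.

```r
# Toy standardised sample rows for one set (all values are illustrative).
std_sample <- data.frame(
  sample_standardised_length_class_lf = c(45, 110),
  sample_number_measured_extrapolated_lf = c(200, 30),
  pww = c(0.4, 0.4),              # hypothetical column holding the PWW of the set
  parameter_a = c(2e-05, 2e-05),  # illustrative LWR parameters
  parameter_b = c(3, 3)
)

# Number of individuals weighted by the set proportion.
std_sample$sample_number_weighted <-
  std_sample$sample_number_measured_extrapolated_lf * std_sample$pww

# Weight (kg) of one individual from the length weight relationship.
std_sample$sample_weight_unit <- std_sample$parameter_a *
  std_sample$sample_standardised_length_class_lf ^ std_sample$parameter_b

# Weight (kg) of all weighted individuals of the length class.
std_sample$sample_weight <-
  std_sample$sample_weight_unit * std_sample$sample_number_weighted

# Weight category based on the unit weight.
std_sample$sample_category <-
  ifelse(std_sample$sample_weight_unit <= 10, "-10kg", "+10kg")

print(std_sample)
```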
raised_factors_determination()
Raised factors determination for weight sample set to set. This step checks the relevance of the standardized sample set object by calculating 6 parameters at the scale of each well set, where the available data allow it:
number and weight of sampled individuals, total and by weight category (\(\leq\) 10kg or > 10kg):
weighted_samples_minus10: sum of sample_weight for sample weight category \(\leq\) 10 kg, from the standardised_sample_set object created at step 2.6.
weighted_samples_plus10: sum of sample_weight for sample weight category > 10 kg, from the standardised_sample_set object created at step 2.6.
weighted_samples_total: sum of sample_weight for all sample weight categories (\(\leq\) 10 kg and > 10 kg), from the standardised_sample_set object created at step 2.6.
three raising factors, relating the weighted weight of the set (calculated at step 2.4) to the weight of sampled individuals, total and by weight category (\(\leq\) 10kg and > 10kg):
rf_minus10 = weighted_weight_minus10 / weighted_samples_minus10
rf_plus10 = weighted_weight_plus10 / weighted_samples_plus10
rf_total = weighted_weight / weighted_samples_total
The verification thresholds can be modified in the function parameters using the following arguments:
threshold_rf_minus10: by default at 500,
threshold_rf_plus10: by default at 500,
threshold_frequency_rf_minus10: by default at 75,
threshold_frequency_rf_plus10: by default at 75,
threshold_rf_total: by default at 250.
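The three raising factors described above can be sketched as simple ratios. The helper safe_ratio() and the toy values are hypothetical; returning NA when no sample weight is available is an assumption about how missing data could be handled, and the threshold checks are left out because their exact rule is not described here.

```r
# Toy per-set summary (all values are illustrative, in tonnes).
set_summary <- data.frame(
  weighted_weight_minus10 = c(12, 0),
  weighted_weight_plus10 = c(8, 5),
  weighted_weight = c(20, 5),
  weighted_samples_minus10 = c(0.06, 0),
  weighted_samples_plus10 = c(0.02, 0.01)
)
set_summary$weighted_samples_total <-
  set_summary$weighted_samples_minus10 + set_summary$weighted_samples_plus10

# Hypothetical helper: ratio that yields NA when the denominator is not positive.
safe_ratio <- function(numerator, denominator) {
  ifelse(denominator > 0, numerator / denominator, NA_real_)
}

set_summary$rf_minus10 <- safe_ratio(set_summary$weighted_weight_minus10,
                                     set_summary$weighted_samples_minus10)
set_summary$rf_plus10 <- safe_ratio(set_summary$weighted_weight_plus10,
                                    set_summary$weighted_samples_plus10)
set_summary$rf_total <- safe_ratio(set_summary$weighted_weight,
                                   set_summary$weighted_samples_total)

print(set_summary)
```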
full_trips$raised_factors_determination(
threshold_rf_minus10 = as.integer(500),
threshold_rf_plus10 = as.integer(500),
threshold_frequency_rf_minus10 = as.integer(75),
threshold_frequency_rf_plus10 = as.integer(75),
threshold_rf_total = as.integer(250),
global_output_path = NULL
)
threshold_rf_minus10
Object of type integer expected. Threshold limit value for raising factor on individuals category minus 10. By default 500.
threshold_rf_plus10
Object of type integer expected. Threshold limit value for raising factor on individuals category plus 10. By default 500.
threshold_frequency_rf_minus10
Object of type integer expected. Threshold limit frequency value for raising factor on individuals category minus 10. By default 75.
threshold_frequency_rf_plus10
Object of type integer expected. Threshold limit frequency value for raising factor on individuals category plus 10. By default 75.
threshold_rf_total
Object of type integer expected. Threshold limit value for raising factor (all categories). By default 250.
global_output_path
By default object of type NULL, otherwise object of type character expected.
Path of the global outputs directory. The function will create subdirectories if necessary.
By default NULL, for no outputs extraction. Outputs are extracted only if a global_output_path is specified.
If a global_output_path is specified, the following output is extracted and saved in ".csv" format under the path: "global_output_path/level2/data/".
process_2_7: a table (.csv) with as many rows as , and 13 columns:
full_trip_id: retained full trip id, type integer.
full_trip_name: full trip id, type integer.
trip_id: trip identification (unique topiaid from database), type character.
trip_end_date: trip end date, type character.
year_trip_end_date: year of trip end, type integer.
vessel_code: vessel code, type integer.
vessel_type_code: vessel type code, type integer.
well_id: well identification (unique topiaid from database (ps_logbook.well in Observe)), type character.
activity_id: activity identification (unique topiaid from database (ps_logbook.activity in Observe)), type character.
weighted_samples_minus10: weight of sampled individuals (tonnes), for weight category \(\leq\) 10kg, type numeric.
weighted_samples_plus10: weight of sampled individuals (tonnes), for weight category > 10kg, type numeric.
weighted_samples_total: weight of sampled individuals (tonnes), for all weight categories (\(\leq\) 10kg and > 10kg), type numeric.
rf_validation: raising factor status, type integer.
rf_validation_label: raising factor status label, type character.
raised_standardised_sample_set()
Application of process 2.8 raised factors on standardised sample set. This last step aims to express the number and weight of sampled individuals at the scale of the set. The process uses the factors calculated in process 2.7.
If a global_output_path is specified, the following output is extracted and saved in ".csv" format under the path: "global_output_path/level2/data/".
process_2_8: a table (.csv) with as many rows as , and 14 columns:
full_trip_id: retained full trip id, type integer.
full_trip_name: full trip id, type integer.
trip_id: trip identification (unique topiaid from database), type character.
trip_end_date: trip end date, type character.
year_trip_end_date: year of trip end, type integer.
vessel_code: vessel code, type integer.
vessel_type_code: vessel type code, type integer.
well_id: well identification (unique topiaid from database (ps_logbook.well in Observe)), type character.
activity_id: activity identification (unique topiaid from database (ps_logbook.activity in Observe)), type character.
sample_id: sample identification (unique topiaid from database (ps_logbook.sample in Observe)), type character.
species_fao_code: species FAO code, type character.
sample_standardised_length_class_lf: standardized sample length class in curved fork length (LF) (cm), type numeric.
sample_number_weighted_set: sample number weighted by set, type numeric. sample_number_weighted_set = sample_number_weighted * rf, where:
rf: is one of the raising factors calculated in process 2.7, according to the sample weight category (\(\leq\) 10kg or > 10kg).
sample_number_weighted: is the sample number of measured individuals weighted by set weight, after conversion in LF and extrapolation to all counted individuals, calculated in process 2.6.
sample_weight_set: sample weight by set (tonnes), type numeric. sample_weight_set = sample_weight_unit/1000 * sample_number_weighted_set.
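The two formulas above can be illustrated on toy values. The lookup vector rf and its values are hypothetical; the way the factor is selected for each row is only assumed to follow the sample weight category.

```r
# Toy standardised sample set rows (illustrative values).
std_sample_set <- data.frame(
  sample_category = c("-10kg", "+10kg"),
  sample_number_weighted = c(80, 12),
  sample_weight_unit = c(1.8225, 26.62),  # kg per individual
  stringsAsFactors = FALSE
)

# Hypothetical raising factors by weight category, as produced by process 2.7.
rf <- c("-10kg" = 200, "+10kg" = 400)

# Pick the factor matching each row's category.
std_sample_set$rf <- unname(rf[std_sample_set$sample_category])

# Number of individuals raised to the set.
std_sample_set$sample_number_weighted_set <-
  std_sample_set$sample_number_weighted * std_sample_set$rf

# Weight raised to the set, converted from kg to tonnes.
std_sample_set$sample_weight_set <-
  std_sample_set$sample_weight_unit / 1000 *
  std_sample_set$sample_number_weighted_set

print(std_sample_set)
```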
path_to_level3()
Temporary link to the R object model with modelling level 3 process.
If a global_output_path is specified, the following output is extracted and saved in ".RData" format under the path: "global_output_path/".
"inputs_level3_target_year_ocean_ocean_code_country_(ies)": a list of 5 data.frame:
act: a data.frame with as many rows as activities and 10 columns:
id_act: activity identification (unique topiaid from database), type character.
lat: activity latitude, type numeric.
lon: activity longitude, type numeric.
fmod: fishing school type code, type integer. In Observe referential template: 1 for floating object school, 2 for free school and 0 for undetermined school.
date_act: activity date, type POSIXct.
vessel: vessel identification code, type integer.
flag_code: flag of the vessel, three-letter ISO 3 country code(s), type character.
id_trip: trip identification (unique topiaid from database), type character.
ocean: ocean code, type integer. For example ocean_code=1 for the Atlantic Ocean and ocean_code=2 for the Indian Ocean.
code_act_type: activity code to define the type of activity, type integer.
act3: a data.frame with as many rows as elementary catches and 8 columns:
id_act: activity identification (unique topiaid from database), type character.
w_lb_t3: catch weight (tonnes) after weight category conversion in step 1.2, type numeric.
sp_fate_code: species fate codes, type integer. For example in Observe database:
4: discarded alive.
5: discarded dead.
6: retained, presumably destined for the cannery.
8: used for crew consumption on board.
11: discarded status unknown (only for EMS and logbook).
15: retained for local market or dried/salted fish on board.
sp: species FAO code, type character.
count: catch count, type integer.
wcat: weight category after conversion to standard categories in step 1.2 (<10kg and >10kg, or <10kg, 10-30kg and >30kg), type character.
date_act: activity date, type POSIXct.
code_act_type: activity code to define the type of activity, type integer.
samw: a data.frame with 4 columns:
id_act: activity identification (unique topiaid from database), type character.
sp: species FAO code, type character.
wcat: sample weight category (less than 10kg: "-10kg" or more than 10 kg: "+10kg"), type character. Output named sample_category from step 2.6.
w_fit_t3: sample weight by set (activity) in tonnes, type numeric. Output named sample_weight_set from step 2.8.
sset: a data.frame with 4 columns:
id_act: activity identification (unique topiaid from database), type character.
id_sample: sample identification (unique topiaid from database), type character.
quality: sample quality identification code, type integer. For example in Observe referential, a sample with quality=1 corresponds to a "Super-T3" sample and quality=3 is for a "Biology" sample.
type: sample type identification code, type integer. For example in Observe referential, a sample with type=1 corresponds to an "At landing" sample.
wp: a data.frame with 6 columns:
id_well: well identification (unique topiaid from database), type character.
id_act: activity identification (unique topiaid from database), type character.
id_sample: sample identification (unique topiaid from database), type character.
code3l: species FAO code, type character.
weight: well plan's weight declared (tonnes), type numeric.
wcat_well: well's category declared (less than 10kg: "-10kg" or more than 10 kg: "+10kg"), type character.
a list of 5 data.frame or tbl_df, process_level3$raw_inputs_level3$:
act: a data table recording the activities ID, date, type, coordinates, fishing school type and associated trip ID, as well as the vessel, ocean and flag codes.
act3: a data table recording the activities ID, date and type and associated catches (species, species fate code, weight, count and standard weight category).
samw: a data table recording the samples species, weight and weight category by activity (ID).
sset: a data table recording the samples ID, type and quality by activity (ID).
wp: a data table recording the well plan, i.e. the well ID and corresponding activities and samples ID with associated species, sample weight and well's weight category (+10kg or -10kg).
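Since the tables share the id_act key, they can be combined with ordinary joins. The sketch below uses a hand-built list named inputs as a stand-in for the returned object, with only a few of the documented columns and made-up values; it is not the output of a real run.

```r
# Stand-in for two of the five tables (toy values, subset of columns).
inputs <- list(
  act = data.frame(
    id_act = c("a1", "a2"),
    ocean = c(1L, 1L),
    fmod = c(1L, 2L),
    stringsAsFactors = FALSE
  ),
  act3 = data.frame(
    id_act = c("a1", "a1", "a2"),
    sp = c("YFT", "SKJ", "YFT"),
    w_lb_t3 = c(10, 25, 40),
    stringsAsFactors = FALSE
  )
)

# Attach the activity attributes to each elementary catch.
catches <- merge(inputs$act3, inputs$act, by = "id_act")

# Total logbook catch weight (tonnes) by fishing school type and species.
catch_by_fmod <- aggregate(w_lb_t3 ~ fmod + sp, data = catches, FUN = sum)

print(catch_by_fmod)
```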
data_preparatory()
Data preparatory for the t3 modelling process (level 3).
full_trips$data_preparatory(
inputs_level3 = NULL,
inputs_level3_path = NULL,
output_directory,
periode_reference_level3 = NULL,
target_year = as.integer(lubridate::year(Sys.time() - 1)),
period_duration = 4L,
target_ocean = NULL,
distance_maximum = as.integer(5),
number_sets_maximum = as.integer(5),
set_weight_minimum = as.integer(6),
minimum_set_frequency = 0.1,
vessel_id_ignored = NULL
)
inputs_level3
Object of type data.frame expected. Inputs of level 3 (see function path_to_level3).
inputs_level3_path
Object of type character expected. Path to the folder containing the yearly data output of levels 1 and 2 (output of the function path_to_level3). If provided, it replaces the inputs_level3 object.
output_directory
Object of type character expected. Path of the outputs directory.
periode_reference_level3
Object of type integer expected. Year(s) period of reference for modelling estimation.
target_year
Object of type integer expected. Year of interest for the model estimation and prediction. Default value is current year -1.
period_duration
Object of type integer expected. Number of years used for the modelling. The default value is 4.
target_ocean
Object of type integer expected. The code of the ocean of interest.
distance_maximum
Object of type integer expected. Maximum distance between all sets of a sampled well. By default 5.
number_sets_maximum
Object of type integer expected. Maximum number of sets allowed in mixture. By default 5.
set_weight_minimum
Object of type integer expected. Minimum set size considered. Removes the smallest sets, for which the sample could not be representative. By default 6 t.
minimum_set_frequency
Object of type numeric expected. Minimum threshold proportion of a set in a well to be used for model training in the process. By default 0.1.
vessel_id_ignored
Object of type integer expected. List of vessel id(s) to be ignored in the model estimation and prediction. By default NULL.
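The default for target_year is documented as the current year minus 1, while the usage writes it as the year of Sys.time() - 1. In base R, subtracting 1 from a date-time removes one second, so it may be safer to pass target_year explicitly. The sketch below uses a fixed reference instant (an arbitrary assumption) to show the difference.

```r
# Fixed reference instant so the result does not depend on when the code runs.
now <- as.POSIXct("2024-06-15 12:00:00", tz = "UTC")

# Subtracting 1 from a date-time removes one second: the year is unchanged here.
year_one_second_earlier <- as.integer(format(now - 1, "%Y"))

# Previous calendar year, computed explicitly.
previous_year <- as.integer(format(now, "%Y")) - 1L

print(c(year_one_second_earlier, previous_year))
```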
random_forest_models()
Modelling proportions in sets through random forest models.
full_trips$random_forest_models(
output_level3_process1,
num.trees = 1000L,
mtry = 2L,
min.node.size = 5,
seed_number = 7L,
small_fish_only = FALSE
)
output_level3_process1
Object of type data.frame expected. Output table data_lb_sample_screened from process 3.1.
num.trees
Object of type integer expected. Number of trees to grow. This should not be set to too small a number, to ensure that every input row gets predicted at least a few times. The default value is 1000.
mtry
Object of type integer expected. Number of variables randomly sampled as candidates at each split. The default value is 2.
min.node.size
Object of type numeric expected. Minimum size of terminal nodes. Setting this number larger causes smaller trees to be grown (and thus take less time). The default value is 5.
seed_number
Object of type integer expected. Set the initial seed for the modelling. The default value is 7.
small_fish_only
Object of type logical expected. Whether the model estimates proportions for small fish only (< 10 kg). By default FALSE.
models_checking()
Load each full model and compute figure and tables to check the model quality. Furthermore, create a map of samples used for each model and relationship between logbook reports and samples.
full_trips$models_checking(
output_level3_process2,
output_directory,
plot_sample = FALSE,
avdth_patch_coord = FALSE
)
output_level3_process2
Object of type list expected. Outputs models and data from process 3.2.
output_directory
Object of type character expected. Outputs directory path.
plot_sample
Object of type logical expected. Whether the sample figure is computed. By default FALSE.
avdth_patch_coord
Parameter waiting for the coordinate conversion patch from the AVDTH database.
data_formatting_for_predictions()
Formatting data for model predictions.
full_trips$data_formatting_for_predictions(
inputs_level3,
output_level3_process1,
target_year,
vessel_id_ignored = NULL,
country_flag = NULL,
input_type = "observe_database",
small_fish_only = FALSE
)
inputs_level3
Object of type data.frame expected. Inputs of level 3 (see function path_to_level3).
output_level3_process1
Object of type data.frame expected. Output table data_lb_sample_screened from process 3.1.
target_year
Object of type integer expected. The year of interest for the model estimation and prediction.
vessel_id_ignored
Object of type integer expected. Vessel id(s) to be ignored in the model estimation and prediction. By default NULL.
country_flag
Three-letter FAO flag code of the country for which catches are estimated.
input_type
Type of coding used in the different databases. Values can be 'observe_database' or 'avdth_database'. Default value is 'observe_database'.
small_fish_only
Object of type logical expected. Whether the model estimates proportions for small fish only (< 10 kg). By default FALSE.
model_predictions()
Model predictions for the species composition and computing of catches.
full_trips$model_predictions(
output_level3_process2,
output_level3_process4,
output_directory,
country_flag = NULL,
ci = FALSE,
ci_type = "all",
Nboot = 50,
plot_predict = FALSE
)
output_level3_process2
Object of type list expected. Outputs from level 3 process 2 (random forest models).
output_level3_process4
Object of type list expected. Outputs from level 3 process 4 (data formatting for predictions).
output_directory
Object of type character expected. Outputs directory path.
country_flag
Three-letter FAO flag code of the country for which catches are estimated.
ci
Object of type logical expected. Logical indicating whether the confidence interval is computed. The default value is FALSE as it is a time-consuming step.
ci_type
Type of confidence interval to compute. The default value is "all". Other options are "set" for ci on each set, "t1" for ci on nominal catch by species, "t1-fmod" for ci on nominal catch by species and fishing mode, and "t2" and "t2-fmod" for ci by 1 degree square and month. A vector of several ci options can be provided. ci_type is used only if the ci parameter is TRUE.
Nboot
Object of type numeric expected. The number of bootstrap samples desired for the ci computation. The default value is 50.
plot_predict
Object of type logical expected. Logical indicating whether maps of catch at size have to be done.
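To give an idea of what a bootstrap confidence interval with Nboot resamples involves, here is a generic percentile bootstrap on toy set catches. This is a textbook sketch under simple assumptions (independent sets, resampling of sets with replacement); it is not the procedure implemented by model_predictions(), and all names and values are hypothetical.

```r
set.seed(7)  # fixed seed so the sketch is reproducible

# Toy predicted catches (tonnes) for ten sets of one species.
set_catch <- c(12, 5, 30, 8, 21, 14, 3, 17, 9, 26)
n_boot <- 50  # plays the role of the Nboot argument

# Resample the sets with replacement and recompute the total each time.
boot_totals <- vapply(
  seq_len(n_boot),
  function(i) sum(sample(set_catch, size = length(set_catch), replace = TRUE)),
  numeric(1)
)

# Percentile interval on the total catch.
ci <- quantile(boot_totals, probs = c(0.025, 0.975), names = FALSE)

print(ci)
```

A larger number of resamples gives a more stable interval at the cost of computing time, which is consistent with ci being disabled by default.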
show_me_what_you_got()
Most powerful and "schwifty" function in the universe to "open the T3 process" and manipulate R6 objects live.
## ------------------------------------------------
## Method `full_trips$path_to_level3`
## ------------------------------------------------
if (FALSE) { # \dontrun{
process_level3 <- object_full_trips$path_to_level3(global_output_path = final_output_path)
} # }