# An affordable and easy-to-use instrument for automated fish size and weight estimation in mariculture

### Farming site characteristics

The research was carried out at the “Maricoltura e Ricerca Società Cooperativa” fish farm (43°03′34.0″N 9°50′19.4″E), roughly 0.17 nautical miles off Capraia Island (Tuscany Region, Northern Tyrrhenian Sea, Italy). The farming site is characterised by a rocky bottom, a water depth of about 35–40 m, a dissolved oxygen concentration of 5.95 ± 0.30 mg/L (mean ± SD), and an annual surface water temperature of 24.13 ± 0.53 °C (mean ± SD). The facility consists of ten circular sea cages: eight 2400 m³ cages devoted solely to gilthead seabream and European sea bass (Dicentrarchus labrax L.) farming, and two 900 m³ cages also dedicated to experimental trials.

### Smart buoy and stereoscopic camera characteristics

The smart buoy was composed of a 1.2 m × 0.2 m stainless steel cylinder fixed to a 0.6 m wide float (Fig. 4a). The device was equipped with a lithium battery pack, a 4G network router, a multiparametric probe (measuring temperature, pH, and dissolved oxygen), and a stereo camera (Fig. 4c). The buoy was anchored inside a commercial-scale farming cage for underwater image recording and tied by ropes to the floating collar of the cage. The integrated stereo camera was positioned at a depth of about 0.7 m and sealed in a waterproof housing (plexiglass cylinder). The mounted camera was an 8MP Arducam synchronized stereo camera consisting of two 8MP IMX219 camera modules capable of taking pictures simultaneously thanks to a connection to a Raspberry Pi (Table 2). The two camera lenses were spaced 8 cm apart on the vertical axis, and the device was oriented towards the cage net (Fig. 4a–d). The smart buoy transmitted over a cellular network to a cloud-based site where images and data were stored. The images were available for download on a personal computer. During the trial period (3 months, from 1st July to 30th September 2021), the fish cage hosted about 4000 gilthead seabreams (mean weight 606 ± 103 g). Daily feeding and all routine farm procedures were carried out by the farm operators throughout the entire trial period. At the end of the experiment, 200 fish were collected, and standard length (SL) and weight (W) were recorded to determine the length–weight relationship curves and compare the results obtained from the image analysis with the actual size of the fish.

### Image data calibration and analysis

There are a number of approaches for geometric camera calibration23. Most of them have in common the use of markers or patterns, which represent visible and distinguishable object points. These object points and their corresponding image points are then used as observations to determine the parameters of the camera model(s) and the relative orientation(s)24. In the underwater environment, the calibration must model and compensate for the refractive effects of the lenses, the housing port, and the water medium10. Computer vision approaches often use 2D test fields in the form of chessboard targets. Usually, the 2D calibration technique employs a planar calibration pattern of alternating black and white squares to determine the intrinsic and extrinsic parameters of the camera25.

#### Camera calibration and model fitting

In this study, several replicate calibrations were carried out in the water and in different orientations using a chessboard (270 × 190 mm) as a planar calibration pattern (Fig. 5). The corners of 15 squares were manually marked, each square measuring 27 × 27 mm. In this phase, the chessboard images were used to estimate the camera’s radial distortion parameters (Eq. 1, distortion matrix26) and to correct the refraction caused by the propagation of light through different media27. Estimating the radial distortion coefficients helped to remove the barrel and pincushion effects introduced by the camera and the housing.

$$\begin{bmatrix} f_{x} & 0 & 0 \\ s & f_{y} & 0 \\ c_{x} & c_{y} & 1 \end{bmatrix} \qquad \begin{array}{l} \left[ c_{x}\ c_{y} \right]\ \text{– optical center (the principal point), in pixels} \\ \left( f_{x}\ f_{y} \right)\ \text{– focal length in pixels} \\ f_{x} = F/p_{x} \\ f_{y} = F/p_{y} \\ F\ \text{– focal length in world units, expressed in millimeters} \\ \left( p_{x}\ p_{y} \right)\ \text{– size of the pixels in world units} \\ s\ \text{– skew coefficient, which is non-zero if the image axes are not perpendicular} \\ s = f_{x} \tan \alpha \end{array}$$

(1)
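As a worked illustration of Eq. 1, the intrinsic matrix can be assembled from the focal length in millimeters and the physical pixel size. This is a minimal sketch; the sensor values below are hypothetical placeholders, not the calibration results of this study.

```python
import numpy as np

def intrinsic_matrix(F_mm, px_mm, py_mm, cx, cy, alpha=0.0):
    """Build the camera intrinsic matrix of Eq. 1.

    F_mm         -- focal length F in world units (mm)
    px_mm, py_mm -- pixel size (p_x, p_y) in world units (mm)
    cx, cy       -- optical center (principal point) in pixels
    alpha        -- skew angle; s = f_x * tan(alpha)
    """
    fx = F_mm / px_mm        # focal length in pixels, x
    fy = F_mm / py_mm        # focal length in pixels, y
    s = fx * np.tan(alpha)   # skew coefficient
    return np.array([[fx, 0.0, 0.0],
                     [s,  fy,  0.0],
                     [cx, cy,  1.0]])

# Hypothetical values loosely based on an IMX219-class sensor
K = intrinsic_matrix(F_mm=3.04, px_mm=0.00112, py_mm=0.00112,
                     cx=1640.0, cy=1232.0)
```

With zero skew the matrix reduces to the usual pinhole model; the radial distortion coefficients of Eq. 1's companion distortion model are estimated separately from the chessboard images.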

In a second phase, the stereo images (one pair of images per photo shoot) were marked with 4 pairs of reference points (landmarks) at different positions on the board (corners) to estimate the translation of each pixel of the board between the two stereo images; the relative translation of the target between the two stereo images is directly related to the distance between the camera and the target itself (Fig. 1). The closer the target is to the camera, the greater the translation of the target between the two stereo images. This was key information to correctly estimate the actual target size in pixels. A total of 52 images and 208 single measurements from the calibration chessboard were used to compute the relationship.
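The translation of a landmark between the two stereo images can be sketched as a simple pixel distance between matched points; since the cameras are stacked vertically, the shift is mainly along the vertical axis. The coordinates below are hypothetical, illustrating only that a nearer target translates more than a farther one.

```python
import math

def landmark_translation(pt_left, pt_right):
    """Pixel translation (disparity) of one landmark between the
    two images of a stereo pair."""
    return math.dist(pt_left, pt_right)

# Hypothetical matched chessboard corners: the near board shows a
# larger translation between the stereo pair than the far board.
near = landmark_translation((510.0, 400.0), (510.0, 520.0))  # 120 px
far = landmark_translation((512.0, 430.0), (512.0, 470.0))   # 40 px
assert near > far
```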

#### Measurement error estimation

In order to estimate the measurement error (mean absolute percentage error, MAPE), 17 images of plastic fish silhouettes of 4 known different sizes were taken and processed (standard lengths of 22.0, 24.2, 29.0, and 33.6 cm, Fig. 5). The images were captured by placing the targets in front of the camera at increasing distances. The known lengths of the fish silhouettes were compared with the lengths estimated by the AI to calculate the error in cm and as a percentage of the fish body length.
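The MAPE computation itself is straightforward; a minimal sketch follows, where the estimated lengths are hypothetical values, not the errors measured in this study.

```python
def mape(true_cm, estimated_cm):
    """Mean absolute percentage error between known silhouette
    lengths and the lengths returned by the AI pipeline."""
    errors = [abs(t - e) / t for t, e in zip(true_cm, estimated_cm)]
    return 100.0 * sum(errors) / len(errors)

# The four known silhouette sizes; the estimates are illustrative.
true_lengths = [22.0, 24.2, 29.0, 33.6]
estimates = [21.5, 24.8, 28.4, 34.2]
error_pct = mape(true_lengths, estimates)
```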

#### AI automated fish recognition, landmark positioning, and fish length measurement

During the image acquisition stage, the fish swam freely within the cage, without being oriented along either of the x–y axes of the camera plane. For fish body length estimation, a complex AI pipeline was designed (Fig. 6). The pipeline was split into smaller packages to break the final pipeline task down into its components and thus simplify and manage the analysis more efficiently.

The raw stereo images were fed to an improved Convolutional Neural Network (CNN) called You Only Look Once (YOLO) v428. As a strong one-stage detection algorithm, the YOLO series has high detection accuracy and fast detection speed, and is widely used in various target detection tasks. Several studies have applied this algorithm for detection purposes: Tian et al.29 used an improved YOLOv3 to detect apples at different growth stages; Shi et al.30 proposed a YOLO network pruning method that can be used as a lightweight mango detection model for mobile devices; and Cai et al.31 proposed an improved YOLOv3 with MobileNetv1 as a backbone to detect fish. In this study, the YOLOv4 CNN was trained on 1400 properly annotated images (training: n = 1120, validation: n = 280), collected from the Open Images dataset (Open Images Dataset V6 – storage.googleapis.com) and from the field, to locate individual fish within the image using bounding boxes. The training was carried out for 6000 iterations and reached a CIoU loss = 1.532 and an mAP = 87%33. In a second step, each bounding box was used to obtain the individual image of the fish, which was then entered into a well-known CNN optimized for image recognition, RESNET-101 (RES101)34. The training of RES101 was carried out in PyTorch35 using the transfer learning technique proposed by Monkman et al.36. Moreover, as for individual fish location, automated landmark detection was achieved using the RESNET-101 CNN, with the last layer modified to detect two landmarks (the snout tip and the base of the middle caudal rays) on the fish shape. The training (n = 8960) and test (n = 3840) datasets were obtained from 200 field pictures from which each relevant individual fish was extracted and manually annotated with the required landmarks.
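The hand-off between the two networks amounts to cropping each detected bounding box out of the frame before passing it to the landmark model. A minimal sketch, assuming boxes already converted to a pixel top-left (x, y, w, h) convention (YOLO itself predicts normalized box centers, so a conversion step is implied):

```python
import numpy as np

def crop_detection(image, bbox):
    """Crop one detected fish from a frame.

    image -- H x W x 3 array
    bbox  -- (x, y, w, h) in pixels, top-left corner convention
             (an assumed convention, converted from YOLO's output).
    """
    x, y, w, h = bbox
    return image[y:y + h, x:x + w]

# Hypothetical frame and detection
frame = np.zeros((1232, 1640, 3), dtype=np.uint8)
fish_crop = crop_detection(frame, (100, 200, 320, 140))
```

Each such crop is then resized to the landmark network's input resolution before inference.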
Each image was then fed into an augmentation algorithm that generated 64 augmented images with varying levels of scale, noise, rotation, translation, and brightness. This process generated the final image dataset of 12,800 pictures. The training was carried out for 100 epochs and yielded an MSE = 0.23 (mean squared error between the predicted and true landmarks). This automated landmark positioning allowed the algorithm to measure the fish length in pixels by counting the pixels between the two points. Finally, the length unit was transformed from pixels to centimeters, using the translation information derived from the chessboard target images during the calibration phase: the further a target is positioned from the cameras, the less it translates between a pair of stereo images, and vice versa. The extent of this translation can be measured both as an angle (the parallax angle) and as a distance in pixels between the same point in the two stereo images (Fig. 5e); the same technique is used in the parallax method for estimating the distance of stars37. Since the size of the chessboard was known, the translation in pixels of the chessboard’s landmarks was then plotted against their corresponding ratio between length in cm and length in pixels; the fitted model became more accurate the more stereo images were examined at different positions within the camera’s field of view.
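The final length measurement reduces to the Euclidean pixel distance between the two predicted landmarks, scaled by a cm-per-pixel factor from the calibration model. A minimal sketch with hypothetical landmark coordinates and a hypothetical scale factor:

```python
import math

def fish_length_cm(snout_px, caudal_px, cm_per_pixel):
    """Fish standard length: pixel distance between the two predicted
    landmarks (snout tip, base of the middle caudal rays), scaled to
    centimeters. cm_per_pixel would come from the chessboard
    translation model fitted during calibration."""
    length_px = math.dist(snout_px, caudal_px)
    return length_px * cm_per_pixel

# Illustrative values only
length = fish_length_cm(snout_px=(120.0, 310.0),
                        caudal_px=(820.0, 330.0),
                        cm_per_pixel=0.04)
```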

The observation points obtained from the described computation were then entered into a polynomial Ridge Regression algorithm, which produced a function minimizing the residual mean squared error38. This model was finally used to estimate the micron/pixel conversion factor for a given fish, through which the final length in cm is obtained. The distribution of standard length values obtained using the AI algorithm (n = 124, referred to as “AI estimated”) was compared to the one directly measured on a subsample of 190 fish (referred to as “Sampled”) collected in the same cage where the stereo images were taken. The fish employed in this study were harvested by the farm staff and destined for sale in large retailers, as they belonged to the fish farm. A random sub-sample of all catches from the cage was given to the researchers for comparative analysis. Therefore, we handled fish that had already been sacrificed, in line with the current national regulations for farmed animals. The length comparison was carried out using a quantile–quantile (q–q) plot, i.e., a graphical technique for determining whether two data sets come from populations with a common distribution (Fig. 3). If this assumption is true, the points in the scatterplot should fall roughly along the 45° reference line plotted in the graph. The greater the deviation from this baseline, the greater the evidence that the two data sets come from populations with different distributions. The Shapiro–Wilk test (alpha level = 0.05) was performed to test the data sets’ normality, while the Levene test (alpha level = 0.05) was performed to assess the homogeneity of variance between the two distributions. Then, the Welch F test (alpha level = 0.05) was used to compare the distribution means in case the homogeneity of variances was violated.
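The polynomial Ridge fit can be sketched in closed form: build polynomial features and solve the regularized normal equations. The (translation, cm-per-pixel) pairs below are hypothetical, standing in for the 208 chessboard measurements of this study.

```python
import numpy as np

def fit_poly_ridge(x, y, degree=2, alpha=1e-3):
    """Closed-form polynomial Ridge Regression: Vandermonde features
    X, then solve (X^T X + alpha*I) w = X^T y for the coefficients
    (highest power first, matching np.polyval)."""
    X = np.vander(x, degree + 1)
    A = X.T @ X + alpha * np.eye(degree + 1)
    return np.linalg.solve(A, X.T @ y)

def predict(w, x):
    return np.polyval(w, x)

# Hypothetical (translation in px, cm-per-pixel ratio) observations:
# a larger stereo translation means a closer, larger-looking target,
# hence a smaller cm-per-pixel ratio.
trans = np.array([20.0, 40.0, 60.0, 80.0, 100.0, 120.0])
ratio = np.array([0.090, 0.062, 0.047, 0.038, 0.032, 0.028])
w = fit_poly_ridge(trans, ratio)
```

In practice the degree and regularization strength would be chosen by validation against held-out chessboard positions.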

Finally, a length-to-weight relationship (LWR)39 (Eq. 2) was determined using body weight (g) and standard length (cm) measurements from sampled fish (n = 198):

$$W = aL^{b}$$

(2)

where W is the body weight of the fish, a is the intercept linked to body shape, L is the standard length, and b is the exponent related to variations in body shape. The obtained LWR was used to calculate the fish weight from the length values derived from the images processed by the AI40 (fish were collected on the same day the images were taken by the stereo camera).
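The LWR parameters a and b are conventionally fitted by linear least squares on log-transformed data (log W = log a + b log L). A minimal sketch, checked on synthetic data generated from a known curve rather than the study's measurements:

```python
import math

def fit_lwr(lengths_cm, weights_g):
    """Fit W = a * L^b by least squares on the log-transformed data:
    log W = log a + b * log L."""
    n = len(lengths_cm)
    lx = [math.log(l) for l in lengths_cm]
    ly = [math.log(w) for w in weights_g]
    mx, my = sum(lx) / n, sum(ly) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(lx, ly))
         / sum((x - mx) ** 2 for x in lx))
    a = math.exp(my - b * mx)
    return a, b

def weight_from_length(a, b, length_cm):
    """Predicted body weight (g) for an AI-estimated length (cm)."""
    return a * length_cm ** b

# Synthetic check: data generated from W = 0.02 * L^3 is recovered
lengths = [20.0, 24.0, 28.0, 32.0]
weights = [0.02 * l ** 3 for l in lengths]
a, b = fit_lwr(lengths, weights)
```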

The experimental activities involving animals conducted in this study, including their ethical aspects, were approved by the Animal Welfare Body of the CREA Centre for Animal Production and Aquaculture (authorization n. 86670 of 23/09/2021). No human experiments were carried out, nor were human tissue samples used. All the individuals depicted in the images are authors of the study, shown during the logistical organization of the calibration tests. Informed consent was obtained from all individual participants both for participation in the study and for the publication of identifying information/images in an online open-access publication.