BiolCol: Technological Platform for the Dissemination of the Biological Reference Collection of Aquatic Macroinvertebrates

Caicedo-Chacón, Luz-Yamile; Barón-González, Henry-Javier; Sánchez-Pedroza, Juan-Carlos; Aguilar-Carreño, Carlos-Alberto; Vega-Otálora, Kevin-Santiago; Borja-Acosta, Kevin-Giancarlo

doi:10.19053/01211129.v32.n64.2023.15525

Revista Facultad de Ingeniería

Print version ISSN 0121-1129; online version ISSN 2357-5328

Rev. Fac. ing. vol. 32 no. 64, Tunja, Jan./Jun. 2023. Epub Aug 27, 2023

Articles

Luz-Yamile Caicedo-Chacón¹
http://orcid.org/0000-0002-2946-2298

Henry-Javier Barón-González²
http://orcid.org/0000-0002-4462-3133

Juan-Carlos Sánchez-Pedroza³
http://orcid.org/0000-0002-6455-6097

Carlos-Alberto Aguilar-Carreño⁴
http://orcid.org/0000-0002-7415-9327

Kevin-Santiago Vega-Otálora⁵
http://orcid.org/0000-0002-4480-2123

Kevin-Giancarlo Borja-Acosta⁶
http://orcid.org/0000-0002-8159-1871

¹ M. Sc. Fundación Universitaria de San Gil (San Gil-Santander, Colombia). lcaicedo@unisangil.edu.co

² Fundación Universitaria de San Gil (San Gil-Santander, Colombia). hbaron@unisangil.edu.co

³ Fundación Universitaria de San Gil (San Gil-Santander, Colombia). juansanchezp@unisangil.edu.co

⁴ Fundación Universitaria de San Gil (San Gil-Santander, Colombia). carlosaguilar@unisangil.edu.co

⁵ Fundación Universitaria de San Gil (San Gil-Santander, Colombia). kvega@unisangil.edu.co

⁶ Instituto de Investigación de Recursos Biológicos Alexander von Humboldt (Villa de Leyva-Boyacá, Colombia). kborja@humboldt.org.co

Abstract

UNISANGIL holds one of Colombia's five collections of aquatic macroinvertebrates and the only one in the Department of Santander. The bioinformatics platform "Biological Collection - BiolCol" seeks to digitize the UNISANGIL biological reference collection of macroinvertebrates (CBMUS). The design of BiolCol was led by three students and the coordinator of the GIBD-SI research incubator of the UNISANGIL Systems Engineering Program. The development process followed the Scrum methodology. Additionally, we followed the method proposed by Sampieri to identify the needs, establish the objectives, define the scope, schedule the planned activities, and define the deliverables for developing the application. Based on the Darwin Core standard used by the Colombian biodiversity information system (SiB Colombia), we designed the system architecture, built models, and designed the database. To develop the interface, we used Laravel for the back end and NodeJS for the front end, and we integrated images and 3D models for those specimens that had such information available. The platform allows the online registration, storage, and visualization of information on CBMUS collection specimens. It also generates a label with a QR code for each individual, which enables queries of the data stored for each specimen. Overall, this platform aims to facilitate the registration, administration, and dissemination of biological collection information for researchers and the community in general.

Keywords: aquatic macroinvertebrates; bioinformatics platform; biological collection; SCRUM; software

I. INTRODUCTION

Biological collections hold important information on Colombian biodiversity and are part of the national heritage. They support the generation of knowledge for the management, conservation and sustainable use of biodiversity. Each collection is a cumulative record of knowledge generated at a specific time and place, which allows the natural wealth of the territory to be observed over time [¹]. Colombia is a megadiverse country where the description of new species will greatly affect global knowledge of biodiversity [²]. However, it faces an accelerated rate of extinction, in which species are lost faster than knowledge about them is gained [³].

Carrying out a complete biodiversity inventory implies describing both unknown and known species, understanding biological diversity, creating databases to handle information systematically, and building computer networks through which that information can flow. All of this matters because knowledge about the state of biodiversity in a given region is lacking [⁴]. This research project addresses the lack of information management in the UNISANGIL-CBMUS Macroinvertebrate Biological Reference Collection. Rather than covering every aspect of the biology of the species, it focuses on the technical aspects of a tool that supports macroinvertebrate research.

The application makes the characteristics of macroinvertebrates available and records data that researchers can use to estimate biochemical, ecological and biotic factors. Such data can also inform solutions such as ecosystem bioremediation. This technological development seeks to create an updatable and reliable database so that researchers, universities, research institutes and decision makers can share research processes.

Bioinformatics applies informatics to the management of biological and medical data through the collection, storage, organization, analysis, manipulation, presentation and distribution of information. Its objective is to act as a bridge between data and information, making it possible to obtain knowledge about biological processes from data analysis [⁵]. Currently, technology provides important preservation tools that store an original in a standard format whose recovery does not depend on a special technology. Digitization of this kind preserves the original by providing access to a digital substitute and by separating the informational content from the degradation of the physical medium [⁶]. Historical documents are a precious asset for the institutions that protect them, which commonly seek to preserve and conserve them in good condition. These archives are of great value and are often unavailable to the general public [⁷].

Digitization consists of converting an image to binary code with the help of a computer, providing a detailed description of the objects in the scene. Each pixel is assigned a tonal value represented by a binary code, and these binary digits are stored and often reduced to a mathematical representation. The computer then interprets the bit stream to reproduce an analog version that can be displayed or printed [⁸]. Biological collections are an important store of information on biodiversity. The extended specimen preserves the secrets of the natural world and the memory of the environment.

Given the global difficulties in the conservation of species, collections are becoming increasingly important since biological records and specimen information have helped categorise threats to endangered species [⁹]. The National Registry of Biological Collections (RNC) collects and disseminates basic information on biological collections in the country, which makes its invaluable natural heritage visible. This information is essential for the management of biodiversity and ecosystem services [¹⁰].

The Darwin Core Standard (DwC) was developed by the Biodiversity Information Standards community and is used by the Biodiversity Information System in Colombia. This standard allows for the compilation of biodiversity information from different sites and is used by most of the species presence records available on GBIF.org. The related file format is called Darwin Core Archive (DwC-A) and allows data providers to share their information using a common lexicon.

This standardization not only shortens the process of publishing biodiversity information but also facilitates the investigation, evaluation and comparison of data to find solutions [¹¹].

Aquatic macroinvertebrates are large invertebrates measuring more than 500 μm. In freshwater bodies, insects are the most diverse group of aquatic invertebrates, with aquatic immature stages and terrestrial adults. Mayflies, stoneflies and odonates are some of the most common and widely distributed aquatic insects [¹²].

Software is crucial for carrying out academic, research and industrial operations, and new applications are developed to respond to the needs or opportunities of communities. Software design, development, and implementation processes are essential in business systems [¹³]. Agile methodologies for software development provide a set of techniques applied in stages over short work cycles, with the aim of making project delivery more efficient without having to wait until the end [¹⁴]. Agile methods favor incremental development, generating new releases every two or three weeks that are made available to customers so that feedback on the requirements can be obtained quickly [¹³].

The agile methodology focuses on continuous improvement through cycles of planning, creation, verification and improvement, with rapid deliveries that prevent dispersion and keep the team focused on the assigned task [¹⁵]. Scrum is an agile software development methodology based on an iterative and incremental approach, organized into sprints that last between one and four weeks. Each sprint begins with a meeting to define the work to be done, continues with daily reviews to verify progress, and ends with another meeting to evaluate the results [¹⁶].

A Scrum project is divided into three phases: planning, sprint cycles, and project closure. During planning, the objectives of the project and the software architecture are established; the sprint cycles generate the system increments; and in the closing phase, the complete functionality is delivered together with the documentation [¹³]. Various tools were used to develop the project, such as PostgreSQL for database management and JavaScript, TypeScript and CSS for designing user interfaces in a web environment. In the field of biological collections, the applications available to safeguard specimen information are either insufficient or expensive.

For example, Specify requires knowledge of databases, programming languages, and markup languages. It is challenging to maintain trained personnel for this purpose; thus, interdisciplinary collaboration is important to develop more economical and efficient solutions that respond to the needs of the collections.

II. METHODS

At UNISANGIL, the framework proposed by Dr. Roberto Hernández Sampieri is used to carry out research processes. The Faculty of Natural Sciences and Engineering follows this approach to define the problem, establish the relationship between variables and formulate the research question. Following the same approach, the objectives of the project and the methodology were defined, a schedule of activities was designed to monitor the execution of the project, and its deliverables were established, which resulted in the development of the bioinformatics platform [¹⁷].

Consequently, the following question was raised: how can the UNISANGIL CBMUS aquatic macroinvertebrate biological reference collection be made known to the community? The agile Scrum methodology was selected for software development, and frequent meetings were held with the client to manage the application requirements and model them using UML diagrams [¹⁸]. The phases considered were requirements management, backlog management, sprint planning and execution, and finally, inspection and iteration [¹⁹]. The actions carried out in each phase are detailed below.

A. Definition of the Application Requirements

In this stage, the team explored sources of information and existing applications for managing biological collections, such as Specify, Symbiota and Elysia, and organized sessions with biologists and researchers from the CBMUS macroinvertebrate collection to define the application requirements. The Darwin Core standard was chosen to design and create the information storage structure, based on the advice and support of the Humboldt Institute. Finally, four functionalities were defined for the application: access control, user administration, information registry, and query module. Based on this definition, the requirements for each system module were collected and prioritized in templates. User stories were also written, which allowed a better analysis of the system and the later identification of each action that users perform on the platform.

B. Backlog Management

In this stage, the list of functionalities and tasks to be carried out was generated. It details the actions needed to build each of the identified modules, together with the time allocated and the people responsible. Once the four modules that make up the platform had been defined, it was decided to organize the development process by functionality.

C. Sprint Planning Meeting

During the first iteration, the focus was on planning the project. The highest priority requirements for each module were selected, and the application development phase started. A list of actions necessary for the development of each module was prepared, and work was done on the requirements identified in the previous stage. In addition, the effort estimate was made, and times and responsibilities were assigned to continue with the next phase of the project.

D. Sprint Execution

The Scrum methodology was used, and the project was divided into four sprints. Frequent meetings were held to assess the team's progress and resolve problems. Each meeting addressed three questions: What have I done since the last synchronization meeting? What am I going to do from now on? What impediments do I have or will I have?

E. Inspection and Iteration

After completing each iteration of the project, the team met with the client to present the fulfilled requirements. The client evaluated and reviewed the delivery and either approved the increments or indicated the necessary adjustments. Then, in a retrospective meeting, the team evaluated its performance and applied what it had learned in the next phase. The platform database was designed with reference to the Darwin Core standard used by SiB Colombia, during the preparation of the different components [²⁰]. After some tests, a dashboard-type interface was chosen for the project, with the aim of providing a pleasant user experience. The first user interface designs were made with the Balsamiq Wireframes software [²¹], and tests were carried out to choose the best option.

III. RESULTS

As a result of the development of the project, the following was achieved.

A. Application Architecture

The architecture of the platform was defined to give the user a comfortable, friendly and secure experience and to host the information transparently and independently of the administration staff. A client-server web architecture was chosen, which allows the information to be hosted and published and allows the administrator user to upload the data (Figure 1).

Fig. 1 Scheme of the application architecture.

B. Diagrams and Models

UML diagrams (use case and sequence) were used to represent user requests and the interaction of system components. The information storage structure was designed in the PostgreSQL database management system to support user management, data hosting and bulk uploads of the information produced by specimen collection. The design took as reference the structure of the Darwin Core standard used by SiB Colombia and the Humboldt Institute.
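The idea of a storage structure modeled on Darwin Core can be illustrated with a minimal sketch. The article does not publish the actual BiolCol schema, so the table name `occurrence`, the choice of columns and the sample values below are assumptions; the column names are standard Darwin Core terms. Python's built-in `sqlite3` module with an in-memory database stands in for PostgreSQL only so that the example is self-contained.

```python
import sqlite3

# Hypothetical table whose columns reuse Darwin Core term names.
# The real BiolCol schema is not given in the article.
SCHEMA = """
CREATE TABLE occurrence (
    occurrenceID     TEXT PRIMARY KEY,
    catalogNumber    TEXT NOT NULL UNIQUE,
    institutionCode  TEXT NOT NULL,
    collectionCode   TEXT NOT NULL,
    basisOfRecord    TEXT NOT NULL,
    scientificName   TEXT NOT NULL,
    eventDate        TEXT,
    stateProvince    TEXT,
    decimalLatitude  REAL CHECK (decimalLatitude BETWEEN -90 AND 90),
    decimalLongitude REAL CHECK (decimalLongitude BETWEEN -180 AND 180)
)
"""


def create_database() -> sqlite3.Connection:
    """Create an in-memory database containing the sketch table."""
    connection = sqlite3.connect(":memory:")
    connection.execute(SCHEMA)
    return connection


def insert_specimen(connection: sqlite3.Connection, record: dict) -> None:
    """Insert one record; keys are trusted column names in this sketch."""
    columns = ", ".join(record)
    placeholders = ", ".join(":" + name for name in record)
    connection.execute(
        f"INSERT INTO occurrence ({columns}) VALUES ({placeholders})", record
    )


# Illustrative values only; they are not taken from the CBMUS collection.
SAMPLE = {
    "occurrenceID": "example:occurrence:0001",
    "catalogNumber": "EXAMPLE-0001",
    "institutionCode": "UNISANGIL",
    "collectionCode": "CBMUS",
    "basisOfRecord": "PreservedSpecimen",
    "scientificName": "Ephemeroptera",
    "eventDate": "2022-01-01",
    "stateProvince": "Santander",
    "decimalLatitude": 6.55,
    "decimalLongitude": -73.13,
}

conn = create_database()
insert_specimen(conn, SAMPLE)
```

Reusing the standard's term names as column names is one way to keep a later export to a Darwin Core Archive close to a direct column mapping; other layouts are equally possible.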

C. Platform Modules

The four modules proposed for the BiolCol platform were built: user access control, user administration, information registration and query. The platform includes images of the specimens collected in different places in the Department of Santander.

The information registration functionality of the BiolCol platform is based on bulk data uploads: the user downloads a template in Excel format, fills in the data, and then uploads the file to the system database.

Once the data is loaded, the list of macroinvertebrates is displayed in the form, and clicking on the icon next to a specimen code displays more details about that specimen, as shown in Figure 2.
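A bulk upload of this kind usually needs a validation pass before rows reach the database. The following is a minimal sketch of such a pass, not the platform's actual logic: it reads CSV text with the standard library as a stand-in for the Excel template, and the list of required columns and the sample rows are assumptions.

```python
import csv
import io

# Assumed minimum set of template columns (Darwin Core term names).
REQUIRED_COLUMNS = (
    "catalogNumber",
    "scientificName",
    "decimalLatitude",
    "decimalLongitude",
)


def validate_upload(text: str):
    """Split the rows of an uploaded file into accepted records and errors."""
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    missing = [name for name in REQUIRED_COLUMNS if name not in header]
    if missing:
        return [], ["missing columns: " + ", ".join(missing)]

    accepted, errors = [], []
    # Data rows start on line 2 because line 1 holds the header.
    for line_number, row in enumerate(reader, start=2):
        code = (row.get("catalogNumber") or "").strip()
        name = (row.get("scientificName") or "").strip()
        if not code or not name:
            errors.append(f"line {line_number}: catalogNumber and scientificName are required")
            continue
        try:
            latitude = float(row["decimalLatitude"])
            longitude = float(row["decimalLongitude"])
        except (TypeError, ValueError):
            errors.append(f"line {line_number}: coordinates must be decimal numbers")
            continue
        if not (-90 <= latitude <= 90 and -180 <= longitude <= 180):
            errors.append(f"line {line_number}: coordinates out of range")
            continue
        accepted.append(
            {**row, "decimalLatitude": latitude, "decimalLongitude": longitude}
        )
    return accepted, errors


# Illustrative content only; not real collection data.
SAMPLE_FILE = """catalogNumber,scientificName,decimalLatitude,decimalLongitude
EXAMPLE-0001,Ephemeroptera,6.55,-73.13
EXAMPLE-0002,Plecoptera,not-a-number,-73.13
,Odonata,6.55,-73.13
"""

accepted_rows, upload_errors = validate_upload(SAMPLE_FILE)
```

Returning the rejected rows with their line numbers, instead of aborting on the first problem, lets the person filling in the template correct everything in one pass.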

Fig. 2 Query of information registered in the system for each specimen.

The platform displays on a map the place where each specimen was collected, as shown in Figure 3.

Fig. 3 Geolocation view to locate the point where the macroinvertebrate was collected.
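As an illustration of how a stored collection point could be handed to a map view, the sketch below converts a record into a GeoJSON `Feature`. The article does not say which mapping component or data format BiolCol uses, so GeoJSON, the function name and the sample record are assumptions. Note that GeoJSON orders coordinates as longitude first, then latitude.

```python
import json


def specimen_to_feature(record: dict) -> dict:
    """Build a GeoJSON Feature for the point where a specimen was collected."""
    latitude = float(record["decimalLatitude"])
    longitude = float(record["decimalLongitude"])
    if not (-90 <= latitude <= 90) or not (-180 <= longitude <= 180):
        raise ValueError("coordinates out of range")
    return {
        "type": "Feature",
        # GeoJSON positions are [longitude, latitude].
        "geometry": {"type": "Point", "coordinates": [longitude, latitude]},
        "properties": {
            "catalogNumber": record["catalogNumber"],
            "scientificName": record["scientificName"],
        },
    }


# Illustrative record; the values are not taken from the collection.
feature = specimen_to_feature(
    {
        "catalogNumber": "EXAMPLE-0001",
        "scientificName": "Ephemeroptera",
        "decimalLatitude": 6.55,
        "decimalLongitude": -73.13,
    }
)
print(json.dumps(feature, indent=2))
```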

For some of the collected specimens, the system offers a 3D view of the macroinvertebrate. The models and images are saved in a folder on the server and linked to the database, so that each model is related to its corresponding specimen and labels. An example of this display option is shown in Figure 4. The platform allows the model to be rotated 360 degrees to see the organism in more detail.

Fig. 4 3D Visualization of the macroinvertebrates of the CBMUS biological collection.
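The link between files on the server and specimen records can be sketched as a simple lookup. Everything specific here is hypothetical: the media root, the folder layout, the `.glb` and `.jpg` extensions and the rows themselves are placeholders, since the article only states that files are stored in a server folder and related to the database.

```python
from pathlib import PurePosixPath
from typing import Optional

# Hypothetical media root on the server.
MEDIA_ROOT = PurePosixPath("/srv/biolcol/media")

# Hypothetical rows of a media table: each row links one file to one specimen.
MEDIA_ROWS = [
    {"catalogNumber": "EXAMPLE-0001", "kind": "image", "file": "images/EXAMPLE-0001.jpg"},
    {"catalogNumber": "EXAMPLE-0001", "kind": "model3d", "file": "models/EXAMPLE-0001.glb"},
    {"catalogNumber": "EXAMPLE-0002", "kind": "image", "file": "images/EXAMPLE-0002.jpg"},
]


def media_for(catalog_number: str, kind: str) -> Optional[PurePosixPath]:
    """Return the server path of the first matching file, or None."""
    for row in MEDIA_ROWS:
        if row["catalogNumber"] == catalog_number and row["kind"] == kind:
            return MEDIA_ROOT / row["file"]
    return None


def has_3d_view(catalog_number: str) -> bool:
    """Tell the interface whether to offer the 3D viewer for a specimen."""
    return media_for(catalog_number, "model3d") is not None
```

A lookup of this shape explains why only some specimens show the 3D option: the viewer is offered only when a model file is registered for that specimen.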

The fourth module of the BiolCol platform handles queries and generates labels that are printed to identify the organisms in the laboratory's physical collection. Researchers can search for the codes associated with organisms and generate a PDF file with QR codes for printing. Scanning a QR code retrieves the information registered on the platform from any device, from which it can also be sent to print. Figure 5 shows an example of the file generated to print the labels and mark the bottles in which the organisms are stored in a laboratory of the institution.

Fig. 5 Report generation: macroinvertebrate labels in PDF format.
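The label workflow can be outlined as follows, under stated assumptions: the QR code is taken to encode a URL that points to the specimen's page, and both the host name and the `/specimens/` path are placeholders, because the article does not describe the payload. Rendering the payload as a QR image and laying out the PDF would require third-party libraries that the article does not name, so the sketch stops at preparing the label data.

```python
from urllib.parse import quote

# Placeholder host; not the platform's real address.
BASE_URL = "https://biolcol.example.org"


def label_url(catalog_number: str) -> str:
    """Build the URL that a specimen's QR code would encode (assumed format)."""
    code = catalog_number.strip()
    if not code:
        raise ValueError("catalog number is empty")
    # Percent-encode the code so that spaces or slashes cannot break the path.
    return f"{BASE_URL}/specimens/{quote(code, safe='')}"


def label_rows(catalog_numbers):
    """Prepare one label entry per specimen for a later PDF layout step."""
    return [
        {"catalogNumber": code, "qrPayload": label_url(code)}
        for code in catalog_numbers
    ]


labels = label_rows(["EXAMPLE-0001", "EXAMPLE 0002/A"])
```

Encoding a URL, not the full record, keeps the printed code small and means that later corrections to a record are visible without reprinting the label.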

IV. DISCUSSION

The BiolCol platform was created to give visibility to the information of the UNISANGIL CBMUS biological collection and to host information from other biological collections that use the Darwin Core standard for specimen information management. The platform supports data upload through an Excel template, detailed visualization of the information, 3D visualization of some specimens and generation of labels to identify the organisms in the physical collection.

Biological Collection is a bioinformatics tool for the registration, storage and visualization of information from the specimens of the CBMUS-UNISANGIL biological collection. The platform is accessible through the web. Registration is intuitive, friendly, agile and graphic. Users have different roles in the system and can easily access the application.

User stories were used to manage the requirements of the BiolCol application, which allowed users to see in detail how the final product would behave. The client provided feedback during the testing phase, which led to the final approval of the software and the successful fulfillment of the project's expectations.

The new web application made it faster to digitize specimens collected in the field, which reduced the time spent systematizing data from a biological collection that researchers update frequently.

Information often goes unused because of a lack of technological tools and of knowledge about data management. For this reason, automation and the correct use of data are necessary to generate knowledge.

ACKNOWLEDGEMENTS

The authors thank the biologist Frank Carlos Vargas Tangua, director and lecturer of GEASID, and Sandra Liliana Quintero Díaz, M.Sc., a researcher in the same group, for their support throughout the project, which made the construction of this technological tool possible.

REFERENCES

[1] G. Didier, C. M. Villa, “Informe de gestión institucional año 2021,” Instituto de Investigación de Recursos Biológicos Alexander von Humboldt. Bogotá D. C., 2022.

[2] E. Arbeláez Cortés, “Describiendo especies: un panorama de la biodiversidad colombiana en el ámbito mundial,” Acta Biológica Colombiana, vol. 18, no. 1, pp. 165-178, 2013.

[3] J. L. Villaseñor, “¿La crisis de la biodiversidad es la crisis de la taxonomía?,” Botanical Sciences, vol. 93, no. 1, pp. 03-14, 2015.

[4] J. V. Crisci, “Espejos de nuestra época: biodiversidad, sistemática y educación,” Gayana. Botánica, vol. 63, no. 1, pp. 106-114, 2006.

[5] M. Coronado Zamora, ¿Qué es la bioinformática?, 2013. Bioinformática al alcance. http://bioinformatica.uab.cat/genetica_tfg/bioinformaticaabast/Qu%C3%A9_es.html

[6] A. Soto, Digitalización y Registro 2D y 3D, 2014. https://prezi.com/udyedqsg_1h7/digitalizacion-y-registro-2d-y-3d/

[7] D. Murcia, A. Guillen, S. Martínez, Propuesta para la implementación del proceso de digitalización documental certificada para la empresa RTVC sistemas de medios públicos en el proceso de proveedores, 2018. https://repository.ucatolica.edu.co/bitstream/10983/16000/trabajo%20de%20%sintesis%20aplicada%20fase%20final.pdf

[8] K. M. Moreira Banes, Implementación del módulo de digitalización de documentos para la carrera de Ingeniería en Sistemas Computacionales, Universidad de Guayaquil, Ecuador, 2011. http://repositorio.ug.edu.ec/bitstream/redug/6742/1/Tesis%20Completa%20-344-2011.pdf

[9] Instituto von Humboldt, Las colecciones biológicas, ¡fundamentales para la conservación de la biodiversidad!, 2019. http://www.humboldt.org.co/es/noticias/actualidad/item/999-colecciones-conservacion-biodiversidad#:~:text=Las%20colecciones%20biol%C3%B3gicas%20son%20repositorios,la%20memoria%20de%20los%20ecosistemas

[10] Instituto von Humboldt, Registro Nacional de Colecciones Biológicas (RNC), 2020. http://www.humboldt.org.co/es/servicios/servicios-y-recursos/registro-unico-nacional-de-colecciones-biologicas-rnc

[11] Global Biodiversity Information Facility, ¿Qué es Darwin Core y por qué es importante?, 2023. https://gbif.org/es/darwin-core

[12] R. Ladrera, “Los macroinvertebrados acuáticos como indicadores del estado ecológico de los ríos,” Páginas de Información Ambiental, vol. 39, pp. 24-29, 2012.

[13] I. Sommerville, Ingeniería de Software, 9ª Edición. Pearson Educación, México, 2011, pp. 56-73.

[14] Zendesk, ¿Qué es la metodología ágil y cuáles son las más utilizadas?, 2022. https://www.zendesk.com.mx/blog/metodologia-agil-que-es/

[15] M. Tena, ¿Qué es la metodología ágil?, 2018. https://www.bbva.com/es/metodologia-agile-la-revolucion-las-formas-trabajo/

[16] J. Palacio, C. Ruata, Scrum Manager. Gestión de Proyectos. Revisión 1.4.0. Safe Creative, 2011. https://topodata.com/wp-content/uploads/2019/10/210617918-Gestion-de-Proyectos-Juan-Palacios.pdf

[17] R. Hernández Sampieri, C. Fernández Collado, P. Baptista Lucio, Metodología de la investigación, 6ª Edición, McGraw-Hill, México D.F., 2014.

[18] J. Barrera Suárez, D. Morales González, O. Quecho Zambrano, Diseño e integración del módulo “Análisis de Datos” para los resultados obtenidos en las pruebas Saber 11 y Pro en UNISANGIL empleando inteligencia de negocios, Undergraduate Thesis, Fundación Universitaria de San Gil, Colombia, 2011.

[19] Equipo IDA, Metodología Scrum en proyectos digitales, 2017. https://blog.ida.cl/estrategia-digital/metodologia-scrum-en-proyectos-digitales/

[20] SiB Colombia, SiB Colombia, 2022. https://biodiversidad.co/

[21] Balsamiq Wireframes, Wireframe tool, 2008. https://balsamiq.com/wireframes/

Citation: L.-Y. Caicedo-Chacón, H.-J. Barón-González, J.-C. Sánchez-Pedroza, C.-A. Aguilar-Carreño, K.-S. Vega-Otálora, K.-G. Borja-Acosta, “BiolCol: Technological Platform for the Dissemination of the Biological Reference Collection of Aquatic Macroinvertebrates,” Revista Facultad de Ingeniería, vol. 32, no. 64, e15525, 2023. https://doi.org/10.19053/01211129.v32.n64.2023.15525

AUTHORS’ CONTRIBUTION

Luz-Yamile Caicedo-Chacón: project administration, writing and validation.

Henry-Javier Barón-González: writing, supervision and validation.

Juan-Carlos Sánchez-Pedroza, Carlos-Alberto Aguilar-Carreño, Kevin-Santiago Vega-Otálora: software, methodology and visualization.

Kevin-Giancarlo Borja-Acosta: writing and supervision.

Received: January 20, 2023; Accepted: April 22, 2023; Published: May 06, 2023

This is an open-access article distributed under the terms of the Creative Commons Attribution License