3 posts tagged with "gsoc"

Google Summer of Code 2022 - MLOps for Reproducible Science

September 11, 2022 · 16 min read

GSoC'22 Mentee

Some technical background

The Full Lands Integration Tool (FLINT) is an open-source software technology designed for measuring, reporting and verifying (MRV) greenhouse gas emissions and removals from forestry, agriculture and other land uses. FLINT is not an MRV system itself but provides a framework to progressively build MRV systems for specific cases.

The Generic Carbon Budget Model (GCBM) is a tool developed to assess and report the cumulative effects of anthropogenic and natural disturbances on forests. The GCBM is a set of modules developed by the Canadian Forest Service (CFS) to run on top of FLINT. This set of modules describes several forest carbon pools. GCBM simulations run at an annual time step. As inputs, the GCBM takes a combination of spatially explicit datasets concerning the forest (with information like tree species or the location of forest types), age, climate, and disturbances, along with non-spatial parameters such as volume-to-biomass conversion coefficients or yield curves. A Python tool is used to prepare the spatial inputs from raster to vector format and an SQLite database handles the non-spatial data [1].

* carbon pool: a reservoir in which carbon can be stored or from which it can be released
** disturbances: unplanned events (e.g. wildfire) that affect carbon pools

Project Description 📌

We applied MLOps techniques and tools to a complex scientific workflow as part of a community-led, distributed carbon modelling platform. These techniques ensure reproducibility, which is the cornerstone of good science.

The main concept of my proposal was to use CI/CD tools for Machine Learning/Data Science projects to automate complicated and technically demanding tasks, and the corresponding reports, that a moja-global contributor might face. I proposed a DVC repository (i.e. remote storage) to cache significant logs and simulation outputs, which I achieved with the DVC pipeline for the GCBM.Belize and GCBM.Colombia repositories. Furthermore, I built a cloud storage repository where DVC's and CML's features can be used to track, cache, compare and store flint-ready datasets, making the datasets reproducible and connected to the git repository that processes them. This matters because spatial datasets have a lot of variability, so a standardisation process is needed to achieve reproducibility. I believe I achieved that with my work on the Land Sector repository. Last but not least, I created a CML Action that generates a short summary report from a simulation run on the FLINT.Cloud platform. It acts as a numerical integration test that helps prevent breaking changes during active development, and it provides a useful template for new users who want to deploy their own FLINT services.

** flint-ready: spatial datasets converted into a common format and coordinate system

Tech Stacks Used

Purpose	Tools and technologies used
CML Action in FLINT.Cloud	CML, GitHub Actions, Docker
DVC pipeline in GCBM.Belize	DVC, Python, R
Land Sector Dataset Processor	DVC, CML, geopandas, gdal, GitHub Actions

Mentors

I owe huge thanks to my mentors Andrew O'Reilly-Nugent and Simple Shell for supporting me throughout the whole program and giving me the feedback I needed to improve my contributions.

Fellow contributors

Huge thanks also to community members Namya, Padmaja, Harsh, and all the other contributors I worked with, not only for their support on technical issues but also for welcoming me to the community.

What do I believe I achieved overall?

In my proposal, my ideas and deliverables were described in generic and vague terms because I hadn't fully understood how several parts of the code base worked. After doing enough research and learning how the code works, I believe I reshaped my deliverables into more realistic ones.

DVC pipelines on the GCBM module

GCBM.Belize PR link: https://github.com/moja-global/GCBM.Belize/pull/14

GCBM.Belize repository

GCBM.Belize was developed as a case study for the GCBM in Belize. The repository serves as an example of how the GCBM works at national scale, and also as a technical reference for new contributors on how the GCBM works, particularly:

How the datasets are prepared
How the simulation runs
How to interact with the results (analysis and reports)

Some existing issues:

The whole workflow was configured to run as Windows Batch scripts. The steps were not connected to each other and did not follow any structure, which made executing them and analyzing the results more complicated.
The Belize repository is a good case for experimenting with the GCBM by modifying specific simulation parameters and comparing the resulting outputs, but there was no procedure for doing that.

A little about DVC

Data Version Control (DVC) is a dataset versioning tool that builds on existing engineering tools like Git. So we can say that DVC is Git for large datasets that cannot be managed by Git itself due to size limits.

What was the goal?

Integrate a system-agnostic pipeline to execute the complete workflow (preprocessing, simulation and postprocessing) that tracks and stores the outputs in remote storage.

What does the pipeline offer to the user?

GCBM.Belize and GCBM.Colombia were developed to be executed only on Windows systems, but using DVC's functionality I set up the pipeline to be system-agnostic.
Before the DVC pipeline was established, the phases of the workflow had to be executed manually. With DVC, anyone can execute the whole workflow with a single command (dvc repro).
The outputs of the GCBM module are for the most part .tif files, and the postprocessing step generates some metrics and plots. All these outputs are listed and tracked by DVC using md5 hashes. I used DVC's remote storage functionality and set up a Google Drive repository that stores them.
Furthermore, I used DVC's support for metrics and plot files to track these kinds of outputs from the workflow. This way, when someone creates another version of the GCBM, the dvc diff command can be used to compare the metrics of the standard GCBM version with the new one (and the output can be used in a potential report as well).
If someone has created and executed different versions of the established pipeline, the dvc exp tool can be used to list, compare and display metrics, plots, and the output tifs (and these outputs can be used in a potential report as well).
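The hash-based tracking mentioned above can be illustrated with a minimal Python sketch. This is not DVC's actual implementation: it only shows the underlying idea that a content hash recorded after one run reveals whether an output has changed later. The helper names `file_md5` and `has_changed` and the file name are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def file_md5(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the md5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def has_changed(path: Path, recorded_hash: str) -> bool:
    """Compare the current content hash with a previously recorded one."""
    return file_md5(path) != recorded_hash


with tempfile.TemporaryDirectory() as tmp:
    output = Path(tmp) / "example_output.tif"  # hypothetical output file
    output.write_bytes(b"first simulation run")
    recorded = file_md5(output)  # the kind of value a lock file would store
    unchanged = has_changed(output, recorded)
    output.write_bytes(b"second simulation run")  # simulate a new run
    changed = has_changed(output, recorded)

print(unchanged, changed)
```

Because only the hash needs to live in Git, the large files themselves can sit in remote storage while the repository still records exactly which version of each output belongs to which commit.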

How does the DVC pipeline work?

Pipelines in DVC are developed in YAML syntax and are divided into stages. For each stage to be appropriately executed we need to define:

the working directory
the command to be executed
the stage's dependencies (i.e. the files that the stage uses)
the stage's outputs (optional)

The GCBM pipeline consists of 12 stages:

Tiler → processes and defines the spatial layers to be used in the simulation
Recliner2GCBM → generates the gcbm_input database
add_species_vol_to_bio and the modify_<type>_parameters scripts → apply preprocessing to the input database so that it better fits specific GCBM cases
UpdateGCBMConfiguration → updates the simulation configuration based on the contents of the Standalone_GCBM/template directory
run_gcbm → runs the simulation
create_tiffs → generates the compiled spatial output
compile_results → generates the output database, a more user-friendly format for showing the results
post_processing → creates the 3 figures that show the distribution of the four unique indicators throughout the simulation, and the metrics files that hold the mean values of 3 carbon stock metrics over 3 periods (1 period = 50 years) for every type of indicator and every type of LifeZone

Another important point is that DVC does not simply run the stages in the order they are written. The execution order follows from the dependencies, so a specific order can be enforced by listing the outputs of the i-th stage as dependencies of the (i+1)-th stage.
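How an execution order can follow from outputs and dependencies is sketched below in Python. This is an illustration of the idea only, not DVC's own code. The stage names come from the pipeline, but the file names under "deps" and "outs" and the helper `execution_order` are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical, simplified stage definitions, deliberately written out of order.
# The file names in "deps" and "outs" are made up for illustration.
stages = {
    "run_gcbm": {"deps": ["gcbm_config.json"], "outs": ["gcbm_output"]},
    "tiler": {"deps": ["raw_layers"], "outs": ["tiled_layers"]},
    "update_GCBM_configuration": {
        "deps": ["tiled_layers", "gcbm_input.db"],
        "outs": ["gcbm_config.json"],
    },
    "recliner2gcbm": {"deps": ["recliner_project.json"], "outs": ["gcbm_input.db"]},
}


def execution_order(stages: dict) -> list[str]:
    """Order stages so each one runs after the stages that produce its deps."""
    # Map every output file to the stage that produces it.
    producer = {out: name for name, spec in stages.items() for out in spec["outs"]}
    # For each stage, collect the stages it has to wait for.
    graph = {
        name: {producer[dep] for dep in spec["deps"] if dep in producer}
        for name, spec in stages.items()
    }
    return list(TopologicalSorter(graph).static_order())


order = execution_order(stages)
print(order)
```

Stages that do not depend on each other (here tiler and recliner2gcbm) may come out in either order, which is why chaining outputs to dependencies is the way to pin a sequence down.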

The pipeline lets professionals outside the programming world interact with simulations and modules without having to dig into code or other difficult technical tasks, so they can focus on analyzing the results.

How can it be utilized in the future?

DVC not only provides a blueprint for organizing the workflow but also makes it possible to declare the dependencies and outputs of each step. In most of the cases that use FLINT.core or other modules, the stages of the workflow are much the same (create a simulation and its configs, run it, compile the results and run some postprocessing scripts). This pipeline can therefore be used, with some modifications, as a blueprint for integrating the same functionality into other FLINT or GCBM cases.

Usage examples

You can list the stages and each stage's outputs by typing:

$ dvc stage list
tiler                       Outputs ..\..\logs\tiler_log.txt
recliner2gcbm_x64           Outputs logs\recliner_log.txt
add_species_vol_to_bio      Outputs logs\add_species_vol_to_bio.log
modify_root_parameters      Outputs logs\modify_root_parameters.log
modify_decay_parameters     Outputs logs\modify_decay_parameters.log
modify_turnover_parameters  Outputs logs\modify_turnover_parameters.log
modify_spinup_parameters    Outputs logs\modify_spinup_parameters.log
update_GCBM_configuration   Outputs ..\logs\update_gcbm_config.log
run_gcbm                    Outputs ..\logs\Moja_Debug.log
create_tiffs                Outputs ..\..\logs\create_tiffs.log, ..\..\processed_output\spatial
compile_results             Outputs ..\..\logs\compile_results.log
post_processing             Reports metrics\1900-1950_Deadwood_Tropical_Dry.json, metrics\1900-1950_Deadwoo…

You can display the metrics by typing:

$ dvc metrics show
Path                                                          area_sum_mean    pool_tc_per_ha_mean    pool_tc_sum_mean
metrics\1900-1950_Deadwood_Tropical_Dry.json                  1142790.3823     8.46574                9674571.56838   
metrics\1900-1950_Deadwood_Tropical_Moist.json                608498.40961     19.67003               11969184.86104  
metrics\1900-1950_Deadwood_Tropical_Premontane_Wet.json       417245.76511     23.20835               9683586.49465   
metrics\1900-1950_Litter_Tropical_Dry.json                    1142790.3823     7.63421                8724307.19108   
metrics\1900-1950_Litter_Tropical_Moist.json                  608498.40961     15.89514               9672169.16877   
metrics\1900-1950_Litter_Tropical_Premontane_Wet.json         417245.76511     19.54048               8153183.07648   
metrics\1900-1950_Soil Carbon_Tropical_Dry.json               1142790.3823     18.18886               20786056.03202  
metrics\1900-1950_Soil Carbon_Tropical_Moist.json             608498.40961     69.2994                42168571.67542  
metrics\1900-1950_Soil Carbon_Tropical_Premontane_Wet.json    417245.76511     73.05651               30482518.25375    
...
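Since the metrics files are plain JSON, the kind of comparison between two runs that dvc diff supports can be sketched in a few lines of Python. The baseline values below are copied from the first row of the table above. The candidate values are invented placeholders and not real simulation results, and the helper `metric_changes` is hypothetical.

```python
import json

# Baseline: values from the first row of the metrics table above.
baseline_json = (
    '{"area_sum_mean": 1142790.3823, '
    '"pool_tc_per_ha_mean": 8.46574, '
    '"pool_tc_sum_mean": 9674571.56838}'
)
# Candidate: made-up numbers standing in for a modified run.
candidate_json = (
    '{"area_sum_mean": 1142790.3823, '
    '"pool_tc_per_ha_mean": 9.0, '
    '"pool_tc_sum_mean": 10285113.4407}'
)


def metric_changes(old: dict, new: dict) -> dict:
    """Return {metric: new - old} for metrics that exist in both and differ."""
    return {
        key: new[key] - old[key]
        for key in old
        if key in new and new[key] != old[key]
    }


changes = metric_changes(json.loads(baseline_json), json.loads(candidate_json))
for name, delta in changes.items():
    print(f"{name}: {delta:+.5f}")
```

Metrics that are identical in both runs are left out, so the result only lists what a parameter change actually affected.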

You can test the pipeline by typing:

$ dvc repro # or 
$ dvc exp run

You can push the outputs the pipeline generated by typing:

$ dvc push

For testing purposes, I used a personal Google Drive storage to upload the outputs here
Here you can see a video showcase for the whole execution of the pipeline only with one command
There is a guide/description that explains how to set up, configure and run DVC and the pipeline with more technical details here

Processing the Land Sector Datasets

PR Link: https://github.com/radistoubalidis/Land_Sector_Datasets/pull/3

The Land Sector Datasets repository

The Land Sector Datasets repository consists of datasets and their metadata for Land Sector management use in the FLINT. In more detail, this repository includes Jupyter Notebooks that contain the license, metadata along with other information, and the processing code to get the datasets (which are in raster format) into vector format (.tif or geoJSON) to be flint-ready.

** flint-ready: the dataset prepared in vector format

What's the issue?

There is also a Google Drive remote storage, maintained by moja-global's contributors, where the processed (vector format) datasets are stored. However, the Land Sector git repository and the Google Drive storage are not directly connected. As a result, there is no procedure that ensures the reproducibility of those datasets, and contributors can only use one version of each dataset.

What's the goal?

The goal here is to establish a framework that guarantees reproducibility while curating the datasets and that reduces the dependencies required to process them. In short, the goal is to connect the Google Drive storage directly to the GitHub repository.

What was my approach?

My first thought was that this case is well suited to DVC's features for tracking datasets through Google Drive, but that alone didn't seem enough. After discussing with the community and analyzing the situation, I broke the issue down into these tasks:

Implement a general processor capable of converting the datasets into flint-ready formats
Track the processing steps (using DVC) and store the flint-ready datasets in the Google Drive storage
Provide a health check on the processed datasets when someone makes changes to the processing steps

Since the repository consists of many different datasets, my mentors suggested I use the Harmonized World Soil Database dataset as a use case. My first task was to implement the code that processes the dataset from raster into vector (.tif) format. I achieved that by writing a script that uses the gdal Python library to convert the raster dataset into vector format and restructure it to be flint-ready. Then, I created a DVC pipeline that runs the processing script and lists the vector dataset as its output. The pipeline is essential for the future, when we may need to process multiple datasets and push them to remote storage with a single command. The third and final step was to create a GitHub action that executes the pipeline and pushes the output datasets to the Google Drive storage. This action is triggered only when someone's commit changes the dataset processing code, and so it provides a health check (i.e. it verifies that the changes to the processing code generate a reproducible and flint-ready dataset). To show how these deliverables work together, here is an example workflow of a potential contributor:

Let's say I want to use the Harmonized World Soil Database dataset, which is stored in GeoTiff format in moja-global's Google Drive storage, but I want it in GeoJSON format, so I make some changes to the processing script and commit them.
The health-check action is triggered and executes the pipeline.
After the action completes, the processed dataset is pushed to the Google Drive storage and an auto-generated comment (using CML) is published (example here) that informs me whether or not my script generated a flint-ready dataset.
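The trigger condition in this workflow, running the health check only when processing code changes, can be sketched in Python as follows. In practice such filtering would normally be configured in the GitHub workflow itself. The glob patterns, file names and the function `should_run_health_check` are hypothetical and only illustrate the decision.

```python
from fnmatch import fnmatch

# Hypothetical patterns describing where processing code might live.
PROCESSING_PATTERNS = ["processing/*.py", "dvc.yaml"]


def should_run_health_check(changed_files: list[str]) -> bool:
    """Return True if any changed file matches a processing-code pattern."""
    return any(
        fnmatch(path, pattern)
        for path in changed_files
        for pattern in PROCESSING_PATTERNS
    )


# A commit touching a (made-up) processing script triggers the check.
code_commit = should_run_health_check(["processing/hwsd.py", "README.md"])
# A documentation-only commit does not.
docs_commit = should_run_health_check(["README.md"])

print(code_commit, docs_commit)
```

Keeping the check limited to processing changes avoids re-running the pipeline and re-uploading datasets for commits that cannot affect the outputs.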

Notes

Most of the datasets in the repository are processed using arcpy, a Python package for geographical analysis that can only be used on Windows systems. As suggested by my mentors, I also worked on refactoring the processing code of some datasets that use arcpy to use system-agnostic libraries such as geopandas. I applied this idea to the Global Ecological Zones dataset, which I processed with geopandas instead of arcpy, so that dataset can now be processed on non-Windows systems too.
I believe that with an OOP approach we can implement a general processor able to handle multiple datasets from the Land Sector repository, so it can be used in other moja-global projects in the future.
As this deliverable is not merged yet, I used a personal Google Drive storage for testing my code, but the same principles can be applied to moja-global's Google Drive storage.

CML Action on FLINT.Cloud

PR Link: https://github.com/moja-global/FLINT.Cloud/pull/132

FLINT.Cloud Repository

The core goal of FLINT.Cloud is to build a continuous deployment pipeline that offers FLINT on cloud resources. It consists of 2 separate APIs that run different kinds of simulations. The APIs are configured inside Docker containers that include the required dependencies, such as the FLINT source code and other required software packages. The repository is meant to work together with other repositories: the community is also working on FLINT.UI, a FLINT frontend client for configuring simulations through the FLINT.Cloud APIs. In the FLINT.Cloud repository, every step of the workflow (creating simulation configs, running simulations and analyzing results) is executed through manual requests to the APIs. So, after discussion with the community and mentors, it was suggested that I create a CI script that runs a simulation and auto-generates a report. This CI script gives new contributors a blueprint of how the APIs work and also automates part of the workflow, relieving the developer/researcher of some technically demanding tasks.

A little about CML

CML is an open-source CI/CD tool for Machine Learning or general Data Science projects. It can be used to track and provide auto-generated reports on development workflows and also can be configured to provide CI pipelines on cloud-hosted runners.

What was the original goal?

The original goal was to establish a GitHub Action that executes a benchmark simulation in FLINT.Cloud and provides an automated report on the simulation results using CML.

How does it work?

First of all, the action is triggered on pull requests with the simulation label. The action takes a benchmark simulation configuration and runs it through the rest_gcbm_api, which is wrapped inside a Docker container that includes the GCBM REST API and all the dependencies the simulation requires. After the simulation ends, we use the CompileResults repo to prepare the compiled_gcbm_output database, on which SQL queries are run to provide information on the simulation's output. CML can then publish these inferences and plots as a comment on the PR.
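The last part of this flow, querying the compiled database and turning the result into text for a PR comment, can be sketched in Python. This is a minimal illustration and not the Action's actual code. The table `pool_totals`, its columns and all the numbers are hypothetical placeholders, since the real compiled_gcbm_output database has its own schema.

```python
import sqlite3

# Hypothetical schema and rows used only for illustration.
# An in-memory database stands in for compiled_gcbm_output.db.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE pool_totals (year INTEGER, pool TEXT, tc REAL)")
connection.executemany(
    "INSERT INTO pool_totals VALUES (?, ?, ?)",
    [
        (2010, "Deadwood", 10.0),
        (2010, "Litter", 5.0),
        (2011, "Deadwood", 12.0),
        (2011, "Litter", 6.0),
    ],
)


def build_report(conn: sqlite3.Connection) -> str:
    """Summarise total carbon per year as a Markdown table."""
    rows = conn.execute(
        "SELECT year, SUM(tc) FROM pool_totals GROUP BY year ORDER BY year"
    ).fetchall()
    lines = ["| Year | Total carbon |", "| --- | --- |"]
    lines += [f"| {year} | {total:.1f} |" for year, total in rows]
    return "\n".join(lines)


report = build_report(connection)
print(report)
connection.close()
```

Writing the returned text to a Markdown file gives a report that a CI step can hand to CML for posting on the pull request.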

What does the Action achieve?

It auto-generates a simulation report when someone raises a PR with the simulation label. The generated report currently looks like this
It establishes a validation process that checks that the changes made in the PR do not break the simulation run.

How and why we had to modify the Action

The FLINT.Cloud repository is still in the development phase. Because the GCBM REST API is changing daily, including the code that handles the input configuration, we ran into some errors that couldn't be resolved. It was therefore decided to temporarily modify the CML Action to run the simulation directly from the FLINT CLI (which is included in the container) until the APIs reach their final form.

How can it be enriched?

After the CML Action PR was merged, the mentors suggested we could enrich the auto-generated report by also displaying (in Jupyter Notebook form) the code that generated any plots. They suggested jupytext, a Python package for versioning and managing Jupyter Notebooks. It offers a variety of commands that convert between Python scripts, Markdown text and notebooks, so you can access the same code in any of these formats. To sum up, I used jupytext to convert the script that generates the report from py:percent format into Markdown so it can be attached to the report. I also raised a new PR for this addition.

Final Thoughts

My experience throughout the mentorship has been wonderful. I believe I learned a variety of things, from how to implement CI principles in projects that are not traditional web projects to how to write code collaboratively. I feel proud to be part of a community with such team spirit, and I want to continue collaborating and making real contributions to moja-global.

References

[1] Shaw, C. H., et al. "Cumulative effects of natural and anthropogenic disturbances on the forest carbon balance in the oil sands region of Alberta, Canada; a pilot study (1985–2012)." Carbon Balance and Management 16.1 (2021): 1-18.

GSoC: GCBM Simulation Editor - Status Report

September 9, 2022 · 4 min read

Yash Kandalkar

GSoC'22 Intern

Hello everyone! Hope you're doing well! This is the final blog of my GSoC journey. It contains a brief summary of everything I worked on during my term and what's left to do.

Mentors

A huge thanks to Andrew O'Reilly-Nugent and Harsh Mishra for helping me at every step of the project. I cannot thank them enough for encouraging me throughout the project and guiding me in the right direction.

Community Members

Special thanks to all these community members for helping me throughout the project: Padmaja, Namya, Shloka, Sanjay, Janvi, and Palak. I wouldn't have been able to finish the project without their help.

Community Bonding Period

I started working on the project by researching the different inputs required for running the GCBM Simulation. I gathered information from Andrew, Padmaja and Namya about the configs that users can edit in the web interface. During this period, we also decided to migrate the project from Vue 2 to Vue 3, as it is the latest version of Vue and supports more libraries. I worked on the migration, updated all the old libraries to their Vue 3 compatible versions and updated some legacy code. I also started working on the new components and removing the old ones in the GCBM Simulation Editor.

Local Domain Configuration UI

Relevant PRs:

Week 1:

Starting in the last week of the community bonding period, I began creating components for each configuration parameter. This week, I created the UI for the Modules configuration. Users can enable and disable (add or remove) modules that will be included in the run. Some modules, like the Decay Module, also contain configurable variables.

Modules

Relevant PRs:

Week 2:

This week I worked on creating the Pools config. Users can edit different pool values and look up pools by name in a search bar.

Pools

Search bar Pools

Relevant PRs: feat: add Pools config in GCBM run #325

Week 3:

This week, I added a Create Simulation page, which will be the entry point of the simulation run, and added sub-menus in the Upload Section for different file types (disturbances, classifiers, input DB and miscellaneous).

Create Simulation UI

Relevant PRs:

chore: redid the file and menu structure #331

Week 4:

I started working on the UI for the Upload section. Here, the major piece of work was the JSON config editor. A GUI editor was required to make it easier for non-developers to edit the JSON attributes. I used the vue3-json-editor library for this functionality.

Upload section

Relevant PRs: https://github.com/moja-global/FLINT-UI/pull/332

Week 5:

I added the UI for editing column names in the DB Editor. After the user uploads a database file, the server responds with the table and attribute names. The requirement was to give users the option to edit table and attribute names on the frontend. I added an Edit button near the database table which, when pressed, makes all the attribute names editable.

database editor

Relevant PRs:

feat: add UI for Spinup, Run simulation, DB Editor, Integrate upload endpoints #347

Week 6:

I connected some configurations, like Local Domain, to the Vuex store. I added vue-persist so that the configurations made on the frontend persist between website reloads.

Relevant PRs:

feat: add UI for Import Simulation modal, connect LocalDomain to the vuex store #355

Week 7:

Added Import Simulation feature in the sidebar. Here, users can upload all the files required by the simulation. If configuration files are uploaded, they will be read using the FileReader API and the UI will be updated accordingly.

import simulation

Week 8:

Added the UI and functionality for changing table names in the db editor.

change table names

Week 9:

Worked on adding an Export Simulation feature so that the users can download the configurations in JSON format if they want to continue configuring the simulation later. These JSON configuration files can be uploaded in the Import Simulation feature, which will update the UI accordingly.

Week 10:

This week, I worked on creating a Tour for the simulation, which will help new users understand the flow of the simulation editor.

Further Steps:

In the coming week, I will be writing tests for the components I created and documenting the features which will help new contributors understand the code and continue enhancing the project.

Previous Blogs:

Progress so far - 2022 GSoC mentees

August 13, 2022 · 5 min read

Namya LG

GSoD'22 Intern

Yash Kandalkar, Radis Toubalidis and Palak Sharma are GSoC mentees for the year 2022.

Yash is working on building the GCBM Simulation Editor with the aim of simplifying the process of running the GCBM simulation and allowing users to configure the inputs supplied. Running a simulation broadly consists of creating a new simulation, uploading inputs, running the simulation and downloading the output on completion. FLINT UI is the frontend interface, while the backend is powered by FLINT.Cloud APIs. There are API endpoints for each of these steps. In the initial phase, Yash migrated all the libraries used in the FLINT UI project to their Vue 3 compatible versions. On completion, he started developing the UI components for creating a new simulation and uploading input files. Currently, the inputs are classifiers, the input database and disturbances (not mandatory). Classifiers and disturbances are supplied as tiff files, and the input database is a SQL database. The JSON files associated with the tiff files are generated at the backend. Yash worked on UI components for configuring parameters like Local Domain, Modules, Pools, etc. that are present in the generated JSON files. The next step is to complete the UI development and API integration for the further stages of the simulation.

Radis is working on the project MLOps for reproducible science. The goal of this project is to streamline the workflow of data scientists on the FLINT.Cloud project by leveraging the combined potential of Data Version Control (DVC) and Continuous Machine Learning (CML). He is working on integrating a CML Action for FLINT.Cloud that runs the whole simulation workflow. The action runs whenever a new pull request is raised to the FLINT.Cloud repository with specific labels (i.e. run-simulation). The action helps to report whether a particular simulation completed. After a simulation ends, a Python script retrieves the log files created by the moja.cli tool and uploads them to git as an artifact. A Python script included in the Compile Results repository generates the output database, compiled_gcbm_output.db, on which SQL queries are run to provide information on the simulation's output. Along with that, plots and visualisations can be created. These inferences and plots will be published as a comment on the pull request. Radis is also building a DVC pipeline for GCBM.Belize that separates each step of the workflow. In this case, DVC not only provides a blueprint for organizing the workflow but also makes it possible to declare the dependencies and outputs of each step.

The DVC pipeline for GCBM.Belize consists of 12 stages. For each stage, the following information has to be defined:

Command that is going to be run in a particular stage
Working directory from which the command will run
Dependencies of a stage (i.e. the files that the command uses)
Outputs of the command, which can be any file that is created. DVC also makes it possible to define outputs as metrics files (JSON format) or plots files.

At every stage of the pipeline, DVC tracks the dependencies and outputs using md5 hashes (dvc.lock). There is also an option to set up remote storage (e.g. Google Drive, AWS, etc.) and store the outputs of each stage there.

The main stages are:

tiler → defines the spatial layers needed for the simulation
recliner2gcbm → creates the input database

add_species_vol_to_bio, modify_root_parameters, modify_decay_parameters, modify_turnover_parameters, modify_spinup_parameters (stages 3 to 7) → apply preprocessing to the input database so that it better fits the Belize case

update_GCBM_configuration → updates the simulation configuration based on the contents of the /Standalone_GCBM/template
run_gcbm → runs the simulation
create_tiffs → generates the compiled spatial output
compile_results → generates the output database, a more user-friendly format to show the output results
post_processing → creates the 3 figures that show the distribution of the four unique indicators throughout the simulation as well as under the different configurations (i.e. with the default parameters or the modified parameters). It also generates the metrics files, which hold the mean values of 3 carbon stock metrics over 3 periods (1 period = 50 years) for every type of indicator and every type of LifeZone.

Palak Sharma is working on the project Building UI Library for Moja Global with the aim of creating an intuitive, consistent, and easy-to-use interface that helps developers in the User-Interface working group and users to accomplish their tasks quickly. The library is a centralized collection of components covering the colors and branding of moja global, typography, spacing, buttons, modals and forms, which will help establish a unified and consistent design language for contributors and users. A new repository has been created for the UI library. After creating prototypes of the UI library in Figma, Palak started working on the implementation. It was decided that pure CSS will be used for building the UI library. The existing codebase in the FLINT UI repository was migrated from Vue JS version 2 to version 3. She worked on adding the Storybook setup to the official UI Library repository to demonstrate the components better and to document the code for using the UI Library. To make things easy for new developers, the usage and functionality of the UI library will be documented. Palak and other contributors have successfully added fully customizable Dropdown, Alert, Button, Card, Datepicker, and Sponsors components to the project, and other components such as Modal, Footer, Navbar, Toggle, Slider, and Accordion are under review.
