***************************************************************************

Credit:
-------

When making use of the data inside this archive, please cite:

Bron et al., 2020, A&A, doi:10.1051/0004-6361/202038040

***************************************************************************

Archive URL:
------------

The original version of this archive is available at

    https://www.iram.fr/~pety/ORION-B/data/orionb-2020-bron.tar.xz

***************************************************************************

Contents:
---------

This archive contains:
- the chemical model grid results used in the article (subfolder "Model_grids").
- the results obtained in the paper for the two physical condition regimes 
(translucent gas and dense cold medium), based on either column density ratios 
or line intensity ratios (subfolders "Results_dense_cold_colden", 
"Results_dense_cold_intensities", "Results_translucent_colden" and 
"Results_translucent_intensities").
- the AutoRank pipeline implementing the method described in the article 
(subfolder "AutoRank").

Model grid results:
-------------------
One file is provided for each grid (translucent medium and cold dense medium), 
containing one line per chemical model.
For each model, we provide the physical conditions, the ionization fraction 
computed by the model, the column densities computed by the model for the tracers 
selected in the article, and the integrated line intensities computed with RADEX 
based on the model results.

Data tables from the paper:
---------------------------

* Tables B1 to B4:

The data of tables B1 to B4 in the article are provided here as ASCII files.
They can be found at 
"Results_{model_grid}_{quantities}/ranking_table_{model_grid}_{quantities}.dat"
where:
- {model_grid} specifies the model grid: "translucent" for the translucent 
medium model grid, or "dense_cold" for the cold dense medium model grid.
- {quantities} specifies the quantities used as observables: "colden" for 
column densities, or "intensities" for integrated line intensities.
Compared to the tables of the article, we have added two columns (labeled as 
"[6]Validity_bound_low(log10)" and "[7]Validity_bound_high(log10)"). 
These two columns give the minimum and maximum values (in log10) found in the 
chemical model grid for each observable ratio. 
The Random Forest models provided here (see below) should not be used outside of 
these bounds. 
Observations that fall outside of these bounds are incompatible with the chemical 
model grid used to train the Random Forest model.
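
The validity check described above can be sketched in Python as follows. This is 
only an illustration: the function name and its arguments are hypothetical, and 
the bounds are assumed to be read by the user from the two added columns of the 
ranking table (both given in log10).

```python
import math


def within_validity_bounds(ratio, bound_low_log10, bound_high_log10):
    """Return True if log10(ratio) lies inside the validity interval.

    ratio            : observed ratio (linear scale, must be positive)
    bound_low_log10  : value of the "[6]Validity_bound_low(log10)" column
    bound_high_log10 : value of the "[7]Validity_bound_high(log10)" column
    """
    if ratio <= 0:
        # log10 is undefined for non-positive values, so the observation
        # cannot be compared with the model grid.
        return False
    log_ratio = math.log10(ratio)
    return bound_low_log10 <= log_ratio <= bound_high_log10
```

An observation for which this function returns False should not be passed to the 
corresponding Random Forest model.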

* Tables B5 to B8:

The data of tables B5 to B8 in the article are provided here as ASCII files.
They can be found at 
"Results_{model_grid}_{quantities}/{quantities}_fit_coeffs_full_{model_grid}.dat"
See the article to find the fit formulae in which the fit coefficients provided 
in these tables should be used.
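
The two file name patterns given in this section can be assembled with a short 
Python helper such as the one below. It is a minimal sketch: the function name 
is hypothetical, and the paths are relative to the root of the unpacked archive.

```python
# Allowed values for the two placeholders, as listed in this README.
MODEL_GRIDS = ("translucent", "dense_cold")
QUANTITIES = ("colden", "intensities")


def table_paths(model_grid, quantities):
    """Return (ranking_table_path, fit_coefficients_path) for one case."""
    if model_grid not in MODEL_GRIDS:
        raise ValueError(f"unknown model grid: {model_grid!r}")
    if quantities not in QUANTITIES:
        raise ValueError(f"unknown quantities: {quantities!r}")
    folder = f"Results_{model_grid}_{quantities}"
    ranking = f"{folder}/ranking_table_{model_grid}_{quantities}.dat"
    coeffs = f"{folder}/{quantities}_fit_coeffs_full_{model_grid}.dat"
    return ranking, coeffs


if __name__ == "__main__":
    # List the table files for the four cases covered by the archive.
    for grid in MODEL_GRIDS:
        for quant in QUANTITIES:
            print(*table_paths(grid, quant), sep="\n")
```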

Trained Random Forest models:
-----------------------------
In each of the four cases, we provide the trained Random Forest models for the 
best 10 ratios.
For each Random Forest model, a .zip archive is provided with the name: 
Results_{model_grid}_{quantities}/RF_models_{system}/ratio{i}_RFmodel_{tracer1}_over_{tracer2}.zip
where {system} specifies the operating system of the machine on which you 
intend to reuse the RF model, {i} is the rank of the ratio 
(1 is the best, 10 is the tenth best), and {tracer1} and {tracer2} are the 
names of the two observable quantities (column densities or line intensities) 
that compose the ratio.
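
As an illustration of this naming scheme, the rank and the two tracer names can 
be recovered from an archive name with a regular expression. The sketch below is 
not part of the archive: the function name is hypothetical, and it assumes that 
the string "_over_" appears in the file name only as the separator between the 
two tracers.

```python
import re
from pathlib import PurePosixPath

# ratio{i}_RFmodel_{tracer1}_over_{tracer2}.zip
_RF_NAME = re.compile(r"ratio(\d+)_RFmodel_(.+)_over_(.+)\.zip")


def parse_rf_model_name(path):
    """Return (rank, tracer1, tracer2) extracted from an RF model archive name."""
    name = PurePosixPath(path).name  # drop the folder part, if any
    match = _RF_NAME.fullmatch(name)
    if match is None:
        raise ValueError(f"not an RF model archive name: {name!r}")
    rank, tracer1, tracer2 = match.groups()
    return int(rank), tracer1, tracer2
```

For example, with the placeholder tracer names "A" and "B", 
parse_rf_model_name("ratio1_RFmodel_A_over_B.zip") returns (1, "A", "B").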

To reuse an RF model, first copy its .zip archive to the machine where you intend 
to reuse the model, then unzip the archive and compile its content by running 
the command "make".
This produces a shared library file (extension ".dylib", ".so" or ".dll" 
depending on the type of system you selected).
To load and use these models from a Python script, you can adapt the following 
code:

> import treelite_runtime
> model = treelite_runtime.Predictor({filename}, verbose=True)
> predictions = model.predict(treelite_runtime.Batch.from_npy2d({your_data_points}))

where {filename} should be the compiled library file corresponding to the RF model 
you want to load (extension ".dylib", ".so" or ".dll"), and {your_data_points} 
the observed values (as a 2D numpy array) of the ratio from which you want to 
predict the target quantity. 
The input values must be given as log10 of the observed ratio, and the predictions 
are returned as log10 of the target variable. 
These compiled library files can also be used from C code. 
See the documentation here:
https://treelite.readthedocs.io/en/latest/tutorials/deploy.html#option-2-deploy-prediciton-code-only
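
The log10 convention described above can be handled with two small helpers such 
as the ones sketched below. The function names are hypothetical, and the sketch 
assumes that each RF model takes a single input feature (the ratio), so that the 
input has one row per observation and one column. The nested list it produces 
can be converted to a 2D numpy array with numpy.asarray before being passed to 
the predictor.

```python
import math


def to_log10_column(ratios):
    """Convert observed ratios (linear scale) to a log10 column.

    Returns a list of one-element rows, i.e. the layout of a 2D array
    with shape (number_of_observations, 1).
    """
    rows = []
    for value in ratios:
        if value <= 0:
            raise ValueError(f"ratio must be positive, got {value!r}")
        rows.append([math.log10(value)])
    return rows


def from_log10(predictions):
    """Convert log10 predictions back to the linear scale."""
    return [10.0 ** p for p in predictions]
```

For example, to_log10_column([1.0, 100.0]) gives [[0.0], [2.0]], and 
from_log10([0.0, 2.0]) gives [1.0, 100.0].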

***************************************************************************

References:
-----------

[4] Bron et al., 2018, A&A, doi:10.1051/0004-6361/201731833
[3] Orkisz et al., 2017, A&A, doi:10.1051/0004-6361/201629220
[2] Gratier et al., 2017, A&A, doi:10.1051/0004-6361/201629847
[1] Pety et al., 2017, A&A, doi:10.1051/0004-6361/201629862

***************************************************************************