============== PAM Statistics ============== Introduction ============ The PAM data structure can be used to generate many diversity statistics. By default, many of the statistics presented in Soberon and Cavner (2015) are generated as well as some phylogenetic diversity metrics if a tree is provided. Additionally, new statistics can be created by decorating functions with the appropriate statistic type and adding them to the stats instance. ---- Generating Default Statistics ============================= You can generate the base statistics without any special configuration of a PamStats instance. Simply supply the PAM and optionally a tree then request the category of statistics you would like to be returned. .. code-block:: python >>> stats = PamStats(pam, tree=my_tree) >>> site_stats = stats.calculate_site_statistics() >>> species_stats = stats.calculate_species_statistics() >>> diversity_stats = stats.calculate_diversity_statistics() ---- Generate Statistics and Write to GeoJSON File ============================================= If you wish to visualize the outputs of the statistics in a GIS software like QGIS, you can do so by writing the outputs as GeoJSON. .. code-block:: python >>> import json >>> from lmpy.matrix import Matrix >>> from lmpy.statistic.pam_stats import PamStats >>> from lmpy.spatial.geojsonify import geojsonify_matrix >>> # Load PAM from file >>> pam_filename = 'my_pam.lmm' >>> pam = Matrix.load(pam_filename) >>> stats = PamStats(pam) >>> # Create and write GeoJSON >>> matrix_geojson = geojsonify_matrix(pam, resolution=0.5, omit_values=[0]) >>> with open('matrix.geojson', mode='wt') as out_json: ... json.dump(matrix_geojson, out_json) ---- Adding New Statistics ===================== It is fairly simple at add new metrics to the computations. The first step is to identify which class of metrics the new metric belongs to, this will determine how the metric is called within the statistics package and what the expected output is. For example, if we wanted to add a metric that computed the sum of tip lengths for the species present at a site, we would need the tree of those species as input and we would produce a single value for each site. This maps to `TreeMetric` and so we can define our function with a tree as input and add the `TreeMetric` decorator. See: `register_metric <../autoapi/lmpy/statistics/pam_stats/index.html#lmpy.statistics.pam_stats.PamStats.register_metric>`_ .. code-block:: python @TreeMetric def sum_tip_lengths(tree): tip_length_sum = 0 for node in tree.nodes(): if node.is_leaf(): tip_length_sum += node.edge_length return tip_length_sum Then we can register this new metric with the statistics package and it will automatically be calculated at the appropriate time with the appropriate inputs. .. code-block:: python >>> stats = PamStats(pam, tree=tree) >>> stats.register_metric('Sum tip lengths', sum_tip_lengths) >>> site_stats = stats.calculate_site_statistics() ---- Covariance Matrix Metrics ========================= Covariance matrix metrics take a PAM as input and produce a site by site or species by species matrix of metric values. sigma_sites ----------- Matrix of covariance of composition of sites. :math:`\mathbf{\Sigma}_{sites}(j,k) = \frac{1}{S}\sum_{i=1}^{S}\delta_{j,l}\delta_{k,l} - \frac{\alpha_j\alpha_k}{S^2}` sigma_species ------------- Matrix of covariance of ranges of species. :math:`\mathbf{\Sigma}_{species}(h,i) = \frac{1}{N}\sum_{j=i}^{N}\delta_{i,j}\delta_{h,j} - \frac{\omega_i\omega_h}{N^2}` ---- Diversity Metrics ================= Diversity metrics take a PAM as input and produce a single metric value for the entire study. schluter_species_variance_ratio ------------------------------- Schluter species-ranges covariance. :math:`V_{species} = \frac{\bar{\psi}^* - N /\beta_W^2}{1/\beta_W - \bar{\varphi}^* / S}` schluter_site_variance_ratio ---------------------------- Schluter sites-composition covariance. :math:`V_{sites} = \frac{\bar{\varphi}^* - S /\beta_W^2}{1/\beta_W - \bar{\psi}^* / N}` num_sites --------- Num sites is the total number of sites in the study area that have any species present. num_species ----------- Num species is the total number of species in the study that are present at any site. whittaker --------- Whittaker's multaplicative beta diversity metric for a PAM. :math:`\beta_W = \frac{1}{\bar{\omega}^{*}}` lande ----- Lande is Lande's addative beta diversity metric for a PAM. :math:`\beta_A = S(1 - 1/\beta_W)` legendre -------- Legendre is Legendre's beta diversity metric for a PAM. :math:`\beta_L = SS(\mathbf{X}) = SN / \beta_W - \left (\sum_{j=1}^{S}\omega_j^2 \right ) / N` c_score ------- C-score is the Stone & Robers checkerboard score for the PAM. :math:`C = \frac{2}{S(S-1)}\left [ \sum_{i=1}^{N} \sum_{h