Modeling in Superstatistics of the Size Distribution of Coding DNA Sequences for the Eukaryotic Genome.
DNA, Non-additive Statistics, Superstatistics, Bayesian Inference, Time Series.
This study examines the core ideas of the superstatistical formalism and applies them to genomic data
from \textit{eukaryotic} species. To account for short-range correlations in the size distributions of the coding
sections (exons) of the \textit{Homo sapiens} genome, we use an extended Heston model within the framework of
superstatistics. For plants, we use time series analysis to describe the relationships
between different time points within the \textit{Cucurbitaceae} family.
The $q$-Gamma and inverse $q$-Gamma distributions proved useful for accurately characterizing the genomic
data. In addition, Bayesian analysis enabled us to quantify the uncertainty in the model parameters
and to select the most suitable model for describing the data. Our findings suggest that superstatistical
approaches combined with generalized distributions are effective for analyzing genomic data. These methods
provide more comprehensive insight into the short-range correlations seen in exon size distributions.