Bayesian inference - Top 30 Publications

Bayesian quantile regression-based partially linear mixed-effects joint models for longitudinal data with multiple features.

In longitudinal AIDS studies, it is of interest to investigate the relationship between HIV viral load and CD4 cell counts, as well as the complicated time effect. Most common models for analyzing such complex longitudinal data are based on mean regression, which fails to provide efficient estimates in the presence of outliers and/or heavy tails. Quantile regression-based partially linear mixed-effects models, a special case of semiparametric models enjoying the benefits of both parametric and nonparametric models, have the flexibility to monitor the viral dynamics nonparametrically and detect the varying CD4 effects parametrically at different quantiles of viral load. Meanwhile, it is critical to consider various data features of repeated measurements, including left-censoring due to a limit of detection, covariate measurement error, and asymmetric distribution. In this research, we first establish Bayesian joint models that account for all these data features simultaneously in the framework of quantile regression-based partially linear mixed-effects models. The proposed models are applied to analyze the Multicenter AIDS Cohort Study (MACS) data. Simulation studies are also conducted to assess the performance of the proposed methods under different scenarios.
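As a rough illustration of why quantile-based fits tolerate outliers better than mean-based ones, the sketch below minimizes the standard quantile "check" loss over a toy sample. The data values and function names are hypothetical and are not taken from the paper. A common Bayesian formulation pairs this loss with an asymmetric Laplace likelihood, but the abstract does not say whether the authors do so.

```python
def check_loss(residual, tau):
    """Quantile check loss: rho_tau(u) = u * (tau - 1[u < 0])."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def total_loss(data, c, tau):
    """Summed check loss of the constant fit c at quantile level tau."""
    return sum(check_loss(y - c, tau) for y in data)


def empirical_quantile(data, tau):
    """Constant fit minimizing the check loss, searched over the observed values."""
    return min(sorted(data), key=lambda c: total_loss(data, c, tau))


# Hypothetical toy sample with one gross outlier.
data = [1.0, 2.0, 3.0, 4.0, 100.0]

median_fit = empirical_quantile(data, 0.5)  # 3.0, unaffected by the outlier
upper_fit = empirical_quantile(data, 0.9)   # 100.0, the upper tail is modelled separately
mean_fit = sum(data) / len(data)            # 22.0, dragged upward by the outlier
```

Fitting several values of tau in this way is what lets a quantile model describe different parts of the response distribution rather than only its centre.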

Comparing two sequential Monte Carlo samplers for exact and approximate Bayesian inference on biological models.

Bayesian methods are advantageous for biological modelling studies due to their ability to quantify and characterize posterior variability in model parameters. When Bayesian methods cannot be applied, due either to non-determinism in the model or limitations on system observability, approximate Bayesian computation (ABC) methods can be used to similar effect, despite producing inflated estimates of the true posterior variance. Owing to generally differing application domains, there are few studies comparing Bayesian and ABC methods, and thus there is little understanding of the properties and magnitude of this uncertainty inflation. To address this problem, we present two popular strategies for ABC sampling that we have adapted to perform exact Bayesian inference, and compare them on several model problems. We find that one sampler was impractical for exact inference due to its sensitivity to a key normalizing constant, and additionally highlight sensitivities of both samplers to various algorithmic parameters and model conditions. We conclude with a study of the O'Hara-Rudy cardiac action potential model to quantify the uncertainty amplification resulting from employing ABC using a set of clinically relevant biomarkers. We hope that this work serves to guide the implementation and comparative assessment of Bayesian and ABC sampling techniques in biological models.
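The variance inflation described above can be reproduced on a toy problem. The sketch below is not the sequential Monte Carlo samplers compared in the paper. It swaps in plain ABC rejection sampling on a hypothetical normal-mean model whose exact posterior is known in closed form, so that the two variances can be compared directly. All data values, the prior, and the tolerance `eps` are assumptions chosen for illustration.

```python
import random
import statistics


def exact_posterior(obs, sigma, prior_sd):
    """Conjugate posterior for theta with y_i ~ N(theta, sigma^2), theta ~ N(0, prior_sd^2)."""
    precision = 1.0 / prior_sd**2 + len(obs) / sigma**2
    mean = (sum(obs) / sigma**2) / precision
    return mean, 1.0 / precision


def abc_rejection(obs, sigma, prior_sd, eps, n_draws, rng):
    """Keep prior draws whose simulated sample mean lies within eps of the observed mean."""
    target = statistics.fmean(obs)
    accepted = []
    for _ in range(n_draws):
        theta = rng.gauss(0.0, prior_sd)
        simulated = [rng.gauss(theta, sigma) for _ in obs]
        if abs(statistics.fmean(simulated) - target) < eps:
            accepted.append(theta)
    return accepted


# Hypothetical observations (sample mean 1.0).
obs = [0.8, 1.3, 0.4, 1.9, 1.1, 0.7, 1.5, 0.2, 1.2, 0.9]
rng = random.Random(0)

exact_mean, exact_var = exact_posterior(obs, sigma=1.0, prior_sd=3.0)
accepted = abc_rejection(obs, sigma=1.0, prior_sd=3.0, eps=0.5, n_draws=20000, rng=rng)
abc_var = statistics.variance(accepted)

# Ratio above 1 indicates that the ABC approximation is wider than the exact posterior.
inflation = abc_var / exact_var
```

Shrinking `eps` moves the ABC approximation toward the exact posterior at the cost of a lower acceptance rate, which is the basic trade-off that more elaborate ABC samplers try to manage.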

New Mitogenomes of Two Chinese Stag Beetles (Coleoptera, Lucanidae) and Their Implications for Systematics.

Although conspicuous and well-studied, stag beetles have been slow to join the genomic era. In this study, mitochondrial genomes of two stag beetles, Sinodendron yunnanense and Prosopocoilus confucius, are sequenced for the first time. Both of their genomes consisted of 13 protein-coding genes (PCGs), 22 transfer RNA genes (tRNAs), 2 ribosomal RNAs (rRNAs), and a control region. The mitogenome of S. yunnanense was 16,921 bp in length, and that of P. confucius was 16,951 bp. The location of the gene trnL(UUR), between the A + T-rich and control region in S. yunnanense, is the first observed in Lucanidae. In P. confucius, an unexpected noncoding region of 580 bp was discovered. Maximum likelihood and Bayesian inference on the 13 mitochondrial PCGs were used to infer the phylogenetic relationships among 12 representative stag beetles and three scarab beetles. The topology of the two phylogenetic trees was almost identical: S. yunnanense was recovered as the most basal Lucanid, and the genus Prosopocoilus was polyphyletic due to P. gracilis being recovered sister to the genera Dorcus and Hemisodorcus. The phylogenetic results, genetic distances and mitogenomic characteristics call into question the cohesion of the genus Prosopocoilus. The genetic resources and findings herein attempt to redress the understudied systematics and mitogenomics of the stag beetles.

Insight into Central Asian flora from the Cenozoic Tianshan montane origin and radiation of Lagochilus (Lamiaceae).

The Tianshan Mountains play a significant role in the Central Asian flora and vegetation. The distribution of Lagochilus is concentrated in the Tianshan Mountains and Central Asia. To investigate generic spatiotemporal evolution, we sampled most Lagochilus species and sequenced six cpDNA regions (rps16, psbA-trnH, matK, trnL-trnF, psbB-psbH, psbK-psbI). We employed BEAST Bayesian inference for dating, and S-DIVA, DEC, and BBM for ancestral area/biome reconstruction. Our results clearly show that the Tianshan Mountains, especially the western Ili-Kirghizia Tianshan, as well as Sunggar and Kaschgar, were the ancestral area. The ancestral biome was mainly the montane steppe zone of valleys and slopes at altitudes of 1700-2700 m, and the montane desert zone of foothills and front hills at 1000-1700 m. The two sections of the genus, Inermes and Lagochilus, displayed "uphill" and "downhill" speciation processes during the middle and late Miocene. The origin and diversification of the genus are explained as coupled with the rapid uplift of the Tianshan Mountains starting in the late Oligocene and early Miocene, ca. 23.66-19.33 Ma, as well as with the uplift of the Qinghai-Tibetan Plateau (QTP) and Central Asian aridification.

Inferring 'weak spots' in phylogenetic trees: application to mosasauroid nomenclature.

Mosasauroid squamates represented the apex predators within the Late Cretaceous marine and occasionally also freshwater ecosystems. Proper understanding of the origin of their ecological adaptations or paleobiogeographic dispersals requires adequate knowledge of their phylogeny. The studies assessing the position of mosasauroids on the squamate evolutionary tree and their origins have long given conflicting results. The phylogenetic relationships within Mosasauroidea, however, have changed only a little over the last decades. Considering the substantial improvements in phylogenetic methodology in recent years, which have resulted, among other things, in numerous alterations to the phylogenetic hypotheses of other fossil amniotes, we test the robustness of our understanding of mosasauroid beginnings and their evolutionary history. We re-examined a data set that results from modifications assembled in the course of the last 20 years and performed multiple parsimony analyses and Bayesian tip-dating analysis. Following the inferred topologies and the 'weak spots' in the phylogeny of mosasauroids, we revise the nomenclature of the 'traditionally' recognized mosasauroid clades, to acknowledge the overall weakness among branches and the alternative topologies suggested previously, and discuss several factors that might have an impact on the differing phylogenetic hypotheses and their statistical support.

Modelling sequences and temporal networks with dynamic community structures.

In evolving complex systems such as air traffic and social organisations, collective effects emerge from their many components' dynamic interactions. While the dynamic interactions can be represented by temporal networks with nodes and links that change over time, they remain highly complex. It is therefore often necessary to use methods that extract the temporal networks' large-scale dynamic community structure. However, such methods are subject to overfitting or suffer from effects of arbitrary, a priori-imposed timescales, which should instead be extracted from data. Here we simultaneously address both problems and develop a principled data-driven method that determines relevant timescales and identifies patterns of dynamics that take place on networks, as well as shape the networks themselves. We base our method on an arbitrary-order Markov chain model with community structure, and develop a nonparametric Bayesian inference framework that identifies the simplest such model that can explain temporal interaction data. The description of temporal networks is usually simplified in terms of their dynamic community structures, whose identification however relies on a priori assumptions. Here the authors present a data-driven method that determines relevant timescales for the dynamics and uses it to identify communities.

Feature inference with uncertain categorization: Re-assessing Anderson's rational model.

A key function of categories is to support predictions about unobserved features of objects. At the same time, humans are often in situations where the categories of the objects they perceive are uncertain. In an influential paper, Anderson (Psychological Review, 98(3), 409-429, 1991) proposed a rational model for feature inferences with uncertain categorization. A crucial feature of this model is the conditional independence assumption: it assumes that the within-category feature correlation is zero. In prior research, this model has been found to provide a poor fit to participants' inferences. This evidence is restricted to task environments inconsistent with the conditional independence assumption. Currently available evidence thus provides little information about how this model would fit participants' inferences in a setting with conditional independence. In four experiments based on a novel paradigm and one experiment based on an existing paradigm, we assess the performance of Anderson's model under conditional independence. We find that this model predicts participants' inferences better than two competing models. The first assumes that inferences are based on just the most likely category. The second is insensitive to categories but sensitive to overall feature correlation. The performance of Anderson's model is evidence that inferences were influenced not only by the more likely category but also by the other candidate category. Our findings suggest that a version of Anderson's model which relaxes the conditional independence assumption will likely perform well in environments characterized by within-category feature correlation.

Reconstructing promoter activity from Lux bioluminescent reporters.

The bacterial Lux system is used as a gene expression reporter. It is fast, sensitive and non-destructive, enabling high frequency measurements. Originally developed for bacterial cells, it has also been adapted for eukaryotic cells, and can be used for whole cell biosensors, or in real time with live animals without the need for euthanasia. However, correct interpretation of bioluminescent data is limited: the bioluminescence is different from gene expression because of nonlinear molecular and enzyme dynamics of the Lux system. We have developed a computational approach that, for the first time, allows users of Lux assays to infer gene transcription levels from the light output. This approach is based upon a new mathematical model for Lux activity, which includes the actions of LuxAB, LuxEC and Fre, with improved mechanisms for all reactions, as well as synthesis and turn-over of Lux proteins. The model is calibrated with new experimental data for the LuxAB and Fre reactions from Photorhabdus luminescens (the source of modern Lux reporters), while literature data has been used for LuxEC. Importantly, the data show clear evidence for previously unreported product inhibition for the LuxAB reaction. Model simulations show that predicted bioluminescent profiles can be very different from changes in gene expression, with transient peaks of light output, very similar to light output seen in some experimental data sets. By incorporating the calibrated model into a Bayesian inference scheme, we can reverse engineer promoter activity from the bioluminescence. We show examples where a decrease in bioluminescence would be better interpreted as a switching off of the promoter, or where an increase in bioluminescence would be better interpreted as a longer period of gene expression. This approach could benefit all users of Lux technology.

Developing a DNA barcode library for perciform fishes in the South China Sea: species identification, accuracy, and cryptic diversity.

DNA barcodes were studied for 1,353 specimens representing 272 morphological species belonging to 149 genera and 55 families of Perciformes from the South China Sea (SCS). The average Kimura 2-parameter (K2P) distances within species, genera, and families were 0.31%, 8.71%, and 14.52%, respectively. A Neighbor-joining (NJ) tree, Bayesian inference (BI) and maximum-likelihood (ML) trees and Automatic Barcode Gap Discovery (ABGD) revealed 260, 253 and 259 single-species-representing clusters, respectively. Barcoding Gap Analysis (BGA) demonstrated that barcode gaps were present for 178 of 187 species analysed with multiple specimens (95.2%), with the minimum interspecific distance to the nearest neighbor larger than the maximum intraspecific distance. A group of three Thunnus species (T. albacares, T. obesus and T. tonggol), a pair of Gerres species (G. oyena and G. japonicus), a pair of Istiblennius species (I. edentulous and I. lineatus), and a pair of Uranoscopus species (U. oligolepis and U. kaianus) were observed with low interspecific distances and overlaps between intra- and interspecific genetic distances. Three species (Apogon ellioti, Naucrates ductor and Psenopsis anomala) showed deep intraspecific divergences and generated two lineages each, suggesting the possibility of cryptic species. Our results demonstrated that DNA barcodes are highly reliable for delineating species of Perciformes in the SCS. The DNA barcode library established in this study will shed light on further research on the diversity of Perciformes in the SCS.
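For readers unfamiliar with the K2P distance and the barcode-gap criterion used above, a minimal sketch follows. It uses the standard Kimura two-parameter formula, d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitional and transversional differences. The sequences and function names are hypothetical, and the study itself would have relied on dedicated software rather than code like this.

```python
import math

# Transitions are purine<->purine (A/G) or pyrimidine<->pyrimidine (C/T) changes.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    # Skip alignment columns containing gaps or ambiguity codes.
    pairs = [(a, b) for a, b in zip(seq1.upper(), seq2.upper())
             if a in "ACGT" and b in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    n = len(pairs)
    p = sum(1 for pair in pairs if pair in TRANSITIONS) / n
    q = sum(1 for a, b in pairs if a != b and (a, b) not in TRANSITIONS) / n
    # math.log raises ValueError when the sequences are too diverged (saturation).
    return -0.5 * math.log((1.0 - 2.0 * p - q) * math.sqrt(1.0 - 2.0 * q))


def has_barcode_gap(max_intraspecific, min_interspecific):
    """Barcode gap: nearest-neighbour distance exceeds the largest within-species distance."""
    return min_interspecific > max_intraspecific


# Hypothetical 10-bp fragments differing by a single A->G transition.
seq_a = "ACGTACGTAC"
seq_b = "GCGTACGTAC"
d = k2p_distance(seq_a, seq_b)
```

The pairs of species reported with overlapping intra- and interspecific distances are exactly the cases in which `has_barcode_gap` would return `False`.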

A multilocus phylogeny of the genus Sarcohyla (Anura: Hylidae), and an investigation of species boundaries using statistical species delimitation.

The genus Sarcohyla comprises 24 species endemic to México. Despite the large number of phylogenetic studies focusing on the family Hylidae, the relationships among the species of Sarcohyla are still poorly known, and the scarce numbers of specimens and tissue samples available for some of the species have hampered an appropriate phylogenetic analysis. We present the most comprehensive molecular phylogenetic study of Sarcohyla to date. We included 17 species of the genus Sarcohyla using data from two mitochondrial (ND1 and 12S) and three nuclear genes (Rag-1, Rhod, and POMc). We performed phylogenetic analyses using Bayesian inference, and the absence of conflicts with strong support between the separate gene trees indicates that incomplete lineage sorting and/or introgressive hybridization are negligible. A coalescent-based species-tree analysis of the four independent loci (three nuclear genes + mtDNA) mostly supports the same species-level relationships as the analysis of the concatenated data. By including new samples from additional species and localities, we find that: (1) the widely distributed species S. bistincta is a complex of at least three species, (2) another undescribed species exists in the group, and (3) the species S. ephemera is not valid and corresponds to a junior synonym of S. calthula. In addition, we conducted marginal likelihood estimation and used Bayes factors to test alternative species delimitation models for S. bistincta, the most widespread nominal species in the group. Our findings support three independent lineages of the S. bistincta group, which are paraphyletic with respect to S. pentheter and S. calthula.

Inferring Within-Host Bottleneck Size: A Bayesian Approach.

Recent technical developments in microbiology have led to new discoveries on the within-host dynamics of bacterial infections in laboratory animals. In particular, they have highlighted the importance of stochastic bottlenecks at the onset of invasive disease. A number of approaches exist for bottleneck-size estimation with respect to within-host bacterial infections; however, some are more appropriate than others under certain circumstances. A Bayesian comparison of several approaches is made in terms of the availability of isogenic multitype bacteria (e.g., WITS), knowledge of post-bottleneck dynamics, and the suitability of dilution with monotype bacteria. A sampling approach to bottleneck-size estimation is also introduced. The results are summarised by a guiding flowchart, which we hope will promote the use of quantitative models in microbiology to refine the analysis of animal experiment data.

Donor-Recipient Identification in Para- and Poly-phyletic Trees Under Alternative HIV-1 Transmission Hypotheses Using Approximate Bayesian Computation.

Diversity of the founding population of Human Immunodeficiency Virus Type 1 (HIV-1) transmissions raises many important biological, clinical, and epidemiological issues. In up to 40% of sexual infections there is clear evidence for multiple founding variants, which can influence the efficacy of putative prevention methods and the reconstruction of epidemiologic histories. To infer who-infected-whom and to compute the probability of alternative transmission scenarios, while explicitly taking phylogenetic uncertainty into account, we created an Approximate Bayesian Computation (ABC) method based on a set of statistics measuring phylogenetic topology, branch lengths, and genetic diversity. We applied our method to a suspected heterosexual transmission case involving 3 individuals, showing a complex monophyletic-paraphyletic-polyphyletic phylogenetic topology. We detected that 7 phylogenetic lineages had been transmitted between two of the individuals based on the available samples, implying that many more unsampled lineages had also been transmitted. Testing whether the lineages had been transmitted at one time or over some length of time suggested that an ongoing super-infection process over several years was most likely. While one individual was found unlinked to the other two, surprisingly, when evaluating two competing epidemiological priors, the donor of the two that did infect each other was not identified by the host root-label, and was also not the primary suspect in that transmission. This highlights that it is important to take epidemiological information into account when analyzing support for one transmission hypothesis over another, as results may be non-intuitive and sensitive to details about sampling dates relative to possible infection dates. 
Our study provides a formal inference framework to include information on infection and sampling times, and to investigate ancestral node-label states, transmission direction, transmitted genetic diversity, and frequency of transmission.

Evolution of the sex ratio and effective number under gynodioecy and androdioecy.

We address the evolution of effective number of individuals under androdioecy and gynodioecy. We analyze dynamic models of autosomal modifiers of weak effect on sex expression. In our zygote control models, the sex expressed by a zygote depends on its own genotype, while in our maternal control models, it depends on the genotype of its maternal parent. Our analysis unifies full multi-dimensional local stability analysis with the Li-Price equation, which for all its heuristic appeal, describes evolutionary change over a single generation. We define a point in the neighborhood of a fixation state from which a single-generation step indicates the asymptotic behavior of the frequency of a modifier allele initiated at an arbitrary point near the fixation state. A concept of heritability appropriate for the evolutionary modification of sex emerges from the Li-Price framework. We incorporate our theoretical analysis into our previously-developed Bayesian inference framework to develop a new method for inferring the viability of gonochores (males or females) relative to hermaphrodites. Applying this approach to microsatellite data derived from natural populations of the gynodioecious plant Schiedea salicaria and the androdioecious killifish Kryptolebias marmoratus, we find that while female and hermaphrodite S. salicaria appear to have similar viabilities, male K. marmoratus appear to survive to reproductive age at less than half the rate of hermaphrodites.

Visual Shape Perception as Bayesian Inference of 3D Object-Centered Shape Representations.

Despite decades of research, little is known about how people visually perceive object shape. We hypothesize that a promising approach to shape perception is provided by a "visual perception as Bayesian inference" framework which augments an emphasis on visual representation with an emphasis on the idea that shape perception is a form of statistical inference. Our hypothesis claims that shape perception of unfamiliar objects can be characterized as statistical inference of 3D shape in an object-centered coordinate system. We describe a computational model based on our theoretical framework, and provide evidence for the model along two lines. First, we show that, counterintuitively, the model accounts for viewpoint-dependency of object recognition, traditionally regarded as evidence against people's use of 3D object-centered shape representations. Second, we report the results of an experiment using a shape similarity task, and present an extensive evaluation of existing models' abilities to account for the experimental data. We find that our shape inference model captures subjects' behaviors better than competing models. Taken as a whole, our experimental and computational results illustrate the promise of our approach and suggest that people's shape representations of unfamiliar objects are probabilistic, 3D, and object-centered.

Rapid curation of gene disruption collections using Knockout Sudoku.

Knockout Sudoku is a method for the construction of whole-genome knockout collections for a wide range of microorganisms with as little as 3 weeks of dedicated labor and at a cost of ∼$10,000 for a collection for a single organism. The method uses manual 4D combinatorial pooling, next-generation sequencing, and a Bayesian inference algorithm to rapidly process and then accurately annotate the extremely large progenitor transposon insertion mutant collections needed to achieve saturating coverage of complex microbial genomes. This method is ∼100× faster and 30× lower in cost than the next comparable method (In-seq) for annotating transposon mutant collections by combinatorial pooling and next-generation sequencing. This method facilitates the rapid, algorithmically guided condensation and curation of the progenitor collection into a high-quality, nonredundant collection that is suitable for rapid genetic screening and gene discovery.

Phylogenetic position and age of Lake Baikal candonids (Crustacea, Ostracoda) inferred from multigene sequence analyzes and molecular dating.

With 104 endemic species, the family Candonidae is one of the most diverse crustacean groups in Lake Baikal, yet their phylogenetic relationships and position in the family have not been addressed so far. Here, we study the phylogenetic position of Baikal candonids within the family and their evolutionary history using molecular markers for the first time since their original description. We choose 10 Baikal species and 28 species from around the world, and three ribosomal RNA genes (18S, 28S, and 16S), and analyze individual and concatenated datasets using Bayesian Inference in MrBayes and BEAST. For molecular divergence time estimates, four fossil records are used to calibrate the root and three internal nodes. The 28S dataset is tested under the strict molecular clock, while for other data we use relaxed clocks. Resulting trees show incongruence between molecular and fossil divergence time estimates, with the former suggesting older ages. Strict molecular clock analysis results in narrower node age confidence intervals and younger time estimates than the other analyses. All trees support at least two candonid lineages in Baikal, with two independent colonization events, and 28S suggests a major radiation between 12 and 5 Mya. This divergence time estimate mostly agrees with another, unrelated, ostracod group in the lake and other lake animals as well. Baikal candonid clades show a close phylogenetic relationship with Palearctic lineages, but their deep divergence is indicative of separate genera. Results also suggest a monophyly of tribes that today live exclusively in subterranean waters, and we offer several hypotheses of their evolutionary history.

Phylogenetic relationships of three representative sea krait species (genus Laticauda; elapidae; serpentes) based on 13 mitochondrial genes.

To investigate the phylogenetic relationships of the genus Laticauda to related higher taxa, we compared the sequences of four mitochondrial genes (12S rRNA, 16S rRNA, ND4, Cytb) from three Laticauda species (L. colubrina, L. laticaudata, and L. semifasciata) with those of 55 Asian and Australo-Melanesian elapid species. We also characterized the complete mitogenomes of the three Laticauda species and compared the sequences of 13 mitochondrial genes from Laticauda species with five terrestrial elapid and one viperid species to estimate phylogenetic relationships and divergence times. Our results showed that the genus Laticauda is paraphyletic to terrestrial elapids and diverged from the Asian elapids approximately 16.23 Mya. The mitogenomes of the three Laticauda species commonly encoded 13 proteins, 22 tRNAs, 12S and 16S rRNAs and two control regions and ranged from 17,170 to 17,450 bp in size. The L. colubrina mitogenome was more similar to that of L. laticaudata than that of L. semifasciata. The divergence time among the three Laticauda clades was estimated at 8-10 Mya, and a close phylogenetic relationship between L. colubrina and L. laticaudata was found. Our results contribute to our understanding of the evolutionary history of sea kraits.

A Nonparametric Multidimensional Latent Class IRT Model in a Bayesian Framework.

We propose a nonparametric item response theory model for dichotomously-scored items in a Bayesian framework. The model is based on a latent class (LC) formulation, and it is multidimensional, with dimensions corresponding to a partition of the items in homogenous groups that are specified on the basis of inequality constraints among the conditional success probabilities given the latent class. Moreover, an innovative system of prior distributions is proposed following the encompassing approach, in which the largest model is the unconstrained LC model. A reversible-jump type algorithm is described for sampling from the joint posterior distribution of the model parameters of the encompassing model. By suitably post-processing its output, we then make inference on the number of dimensions (i.e., number of groups of items measuring the same latent trait) and we cluster items according to the dimensions when unidimensionality is violated. The approach is illustrated by two examples on simulated data and two applications based on educational and quality-of-life data.

A Bayesian test for Hardy-Weinberg equilibrium of biallelic X-chromosomal markers.

The X chromosome is a relatively large chromosome, harboring a lot of genetic information. Much of the statistical analysis of X-chromosomal information is complicated by the fact that males only have one copy. Recently, frequentist statistical tests for Hardy-Weinberg equilibrium have been proposed specifically for dealing with markers on the X chromosome. Bayesian test procedures for Hardy-Weinberg equilibrium for the autosomes have been described, but Bayesian work on the X chromosome in this context is lacking. This paper gives the first Bayesian approach for testing Hardy-Weinberg equilibrium with biallelic markers at the X chromosome. Marginal and joint posterior distributions for the inbreeding coefficient in females and the male to female allele frequency ratio are computed, and used for statistical inference. The paper gives a detailed account of the proposed Bayesian test, and illustrates it with data from the 1000 Genomes project. In that implementation, a novel approach to tackle multiple testing from a Bayesian perspective through posterior predictive checks is used.
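To make the two quantities named above concrete, the following sketch draws from a simple posterior for the female inbreeding coefficient and the male-to-female allele frequency ratio. It is an illustrative Monte Carlo approximation under assumed uniform priors (a Dirichlet prior on female genotype frequencies and a Beta prior on the male allele frequency), not the test proposed in the paper. The genotype counts are hypothetical.

```python
import random


def dirichlet(alphas, rng):
    """One Dirichlet draw, built from independent gamma variates."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def posterior_draws(female_counts, male_counts, n_draws, rng):
    """female_counts = (AA, AB, BB) genotypes; male_counts = (A, B) hemizygous alleles."""
    f_draws, ratio_draws = [], []
    for _ in range(n_draws):
        # Assumed priors: Dirichlet(1, 1, 1) for females, Beta(1, 1) for males.
        p_aa, p_ab, p_bb = dirichlet([c + 1 for c in female_counts], rng)
        p_female = p_aa + 0.5 * p_ab
        p_male = rng.betavariate(male_counts[0] + 1, male_counts[1] + 1)
        # Inbreeding coefficient from P(AB) = 2 p (1 - p) (1 - f).
        f_draws.append(1.0 - p_ab / (2.0 * p_female * (1.0 - p_female)))
        ratio_draws.append(p_male / p_female)
    return f_draws, ratio_draws


def credible_interval(draws, level=0.95):
    """Equal-tailed credible interval from sorted posterior draws."""
    ordered = sorted(draws)
    lower = ordered[int((1.0 - level) / 2.0 * len(ordered))]
    upper = ordered[int((1.0 + level) / 2.0 * len(ordered)) - 1]
    return lower, upper


# Hypothetical counts that sit exactly at equilibrium with equal allele frequencies.
rng = random.Random(1)
f_draws, ratio_draws = posterior_draws((250, 500, 250), (500, 500), 5000, rng)
f_interval = credible_interval(f_draws)
ratio_interval = credible_interval(ratio_draws)
```

Under equilibrium the inbreeding coefficient is 0 and the ratio is 1, so a marker would look suspicious when either credible interval excludes its reference value.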

Appropriate homoplasy metrics in linked SSRs to predict an underestimation of demographic expansion times.

Homoplasy affects demographic inference estimates. This effect has been recognized and corrective methods have been developed. However, no studies so far have defined what homoplasy metrics best describe the effects on demographic inference, or have attempted to estimate such metrics in real data. Here we study how homoplasy in chloroplast microsatellites (cpSSR) affects inference of population expansion time. cpSSRs are popular markers for inferring historical demography in plants due to their high mutation rate and limited recombination.

Genetic diversity of porcine circovirus type 2 (PCV2) in Thailand during 2009-2015.

Porcine circovirus type 2 (PCV2), the essential cause of porcine circovirus associated disease (PCVAD), has evolved rapidly and has been reported worldwide. However, genetic information on PCV2 in Thailand has not been available since 2011. Herein, we studied the occurrence and genetic diversity of PCV2 in Thailand and its relationship to global PCV2 based on ORF2 sequences. The results showed that 306 samples (44.09%) from 56 farms (80%) were PCV2 positive by PCR. Phylogenetic trees constructed by both neighbor-joining and Bayesian Inference yielded similar topologies for the ORF2 sequences. Thai PCV2 comprises four clusters: PCV2a (5.5%), PCV2b (29.41%), intermediate clade 1 (IM1) PCV2b (11.03%) and PCV2d (54.41%). A genetic shift of PCV2 in Thailand has occurred similarly to the global situation. The shift from PCV2b to PCV2d was clearly observed during 2013-2014. Viruses genetically similar to the first PCV2 reported in 2004 still circulate in Thailand. The first Thai PCV2b and PCV2d were closely related to those of the neighboring countries. The haplotype network analysis revealed the relationship of PCV2 in Thailand and other countries. These results indicate that the genetic diversity of PCV2 in Thailand is caused by genetic drift of the local strains and intermittent introduction of new strains or genotypes from other countries. The genetic evolution of PCV2 in Thailand is similar to that occurring globally.

A Contribution to the Morphology and Phylogeny of Chlamydodon, with Three New Species from China (Ciliophora, Cyrtophoria).

Three new cyrtophorian ciliates isolated from coastal areas of China were described based on morphological and genetic data. The Chlamydodon mnemosyne-like species Chlamydodon similis sp. n. differs from its congeners mainly by its number of somatic kineties. Chlamydodon oligochaetus sp. n. is distinguished from its congeners mainly by having fewer somatic kineties, and/or an elongated body shape. Chlamydodon crassidens sp. n. is characterized mainly by an inverted triangular body shape, a posteriorly interrupted cross-striated band (5-6 μm wide), and a large cytostome. Moreover, we provided small-subunit (SSU) rDNA sequences of C. similis sp. n. and C. oligochaetus sp. n. Maximum Likelihood (ML) and Bayesian inference (BI) consistently placed C. similis sp. n. as a sister to C. paramnemosyne, but showed different branching position of C. oligochaetus sp. n., which may be due to a low taxon sampling in the Chlamydodontidae and/or an insufficient resolution of the marker gene at species-level.

Performing Arm-Based Network Meta-Analysis in R with the pcnetmeta Package.

Network meta-analysis is a powerful approach for synthesizing direct and indirect evidence about multiple treatment comparisons from a collection of independent studies. At present, the most widely used method in network meta-analysis is contrast-based, in which a baseline treatment needs to be specified in each study, and the analysis focuses on modeling relative treatment effects (typically log odds ratios). However, population-averaged treatment-specific parameters, such as absolute risks, cannot be estimated by this method without an external data source or a separate model for a reference treatment. Recently, an arm-based network meta-analysis method has been proposed, and the R package pcnetmeta provides user-friendly functions for its implementation. This package estimates both absolute and relative effects, and can handle binary, continuous, and count outcomes.
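The distinction between the two kinds of estimand can be illustrated with a small Python sketch. It does not use or imitate pcnetmeta; the function names and the study counts are hypothetical, and it only computes the raw per-study quantities (a log odds ratio versus arm-level absolute risks), not a fitted network model.

```python
import math

def log_odds_ratio(events_t, n_t, events_b, n_b):
    """Relative effect: log odds ratio of a treatment arm vs. the study's baseline arm."""
    odds_t = events_t / (n_t - events_t)
    odds_b = events_b / (n_b - events_b)
    return math.log(odds_t / odds_b)

def absolute_risk(events, n):
    """Arm-level quantity: observed event proportion in a single arm."""
    return events / n

# Hypothetical two-arm study: 15/100 events on treatment, 30/100 on baseline.
lor = log_odds_ratio(15, 100, 30, 100)
risk_treatment = absolute_risk(15, 100)
risk_baseline = absolute_risk(30, 100)
```

The log odds ratio needs a baseline arm within the same study, whereas each absolute risk is defined for one arm on its own, which is the quantity an arm-based analysis targets.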

Probabilistic inference under time pressure leads to a cortical-to-subcortical shift in decision evidence integration.

Real-life decision-making often involves combining multiple probabilistic sources of information under finite time and cognitive resources. To mitigate these pressures, people "satisfice", foregoing a full evaluation of all available evidence to focus on a subset of cues that allow for fast and "good-enough" decisions. Although this form of decision-making likely mediates many of our everyday choices, very little is known about the way in which the neural encoding of cue information changes when we satisfice under time pressure. Here, we combined human functional magnetic resonance imaging (fMRI) with a probabilistic classification task to characterize neural substrates of multi-cue decision-making under low (1500 ms) and high (500 ms) time pressure. Using variational Bayesian inference, we analyzed participants' choices to track and quantify cue usage under each experimental condition, which was then applied to model the fMRI data. Under low time pressure, participants performed near-optimally, appropriately integrating all available cues to guide choices. Both cortical (prefrontal and parietal cortex) and subcortical (hippocampal and striatal) regions encoded individual cue weights, and activity linearly tracked trial-by-trial variations in the amount of evidence and decision uncertainty. Under increased time pressure, participants adaptively shifted to using a satisficing strategy by discounting the least informative cue in their decision process. This strategic change in decision-making was associated with an increased involvement of the dopaminergic midbrain, striatum, thalamus, and cerebellum in representing and integrating cue values. We conclude that satisficing the probabilistic inference process under time pressure leads to a cortical-to-subcortical shift in the neural drivers of decisions.

Application of Bayesian informative priors to enhance the transferability of safety performance functions.

Safety performance functions (SPFs) are essential tools for highway agencies to predict crashes, identify hotspots and assess safety countermeasures. In the Highway Safety Manual (HSM), a variety of SPFs are provided for different types of roadway facilities, crash types and severity levels. Agencies lacking the necessary resources to develop their own localized SPFs may opt to apply the HSM's SPFs to their jurisdictions. Yet municipalities that want to develop and maintain their own regional SPFs might encounter small-sample bias. Bayesian inference can address this issue by combining the current data with prior information to achieve reliable results. It follows that the essence of Bayesian statistics is the application of informative priors, obtained from other SPFs or experts' experiences.
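How an informative prior stabilizes a small local sample can be sketched with a conjugate Gamma-Poisson update in Python. This is a deliberately simplified illustration, not the model used in the study: it assumes one crash rate per site-year with equal exposure, and the prior parameters and counts are hypothetical.

```python
def gamma_poisson_update(prior_shape, prior_rate, counts):
    """Update a Gamma(shape, rate) prior on a Poisson crash rate with observed counts.

    Assumes each count comes from one site-year of equal exposure.
    """
    post_shape = prior_shape + sum(counts)
    post_rate = prior_rate + len(counts)
    return post_shape, post_rate

# Hypothetical informative prior (mean 8/4 = 2.0 crashes per site-year),
# for example borrowed from an SPF fitted in another jurisdiction.
prior_shape, prior_rate = 8.0, 4.0

# Hypothetical small local sample: four site-years of crash counts (mean 1.5).
local_counts = [2, 0, 3, 1]

post_shape, post_rate = gamma_poisson_update(prior_shape, prior_rate, local_counts)
posterior_mean = post_shape / post_rate  # lies between the prior mean and the sample mean
```

With only four observations, the posterior mean is pulled halfway toward the prior mean because the prior carries the weight of four pseudo-observations; a larger local sample would dominate it.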

Bayesian sensitivity analysis for unmeasured confounding in causal mediation analysis.

Causal mediation analysis techniques enable investigators to examine whether the effect of the exposure on an outcome is mediated by some intermediate variable. Motivated by a data example from epidemiology, we consider estimation of natural direct and indirect effects on a survival outcome. An important concern is bias from confounders that may be unmeasured. Estimating natural direct and indirect effects requires an elaborate series of assumptions in order to identify the target quantities. The analyst must carefully measure and adjust for important predictors of the exposure, mediator and outcome. Omitting important confounders may bias the results in a way that is difficult to predict. In recent years, several methods have been proposed to explore sensitivity to unmeasured confounding in mediation analysis. However, many of these methods limit complexity by relying on a handful of sensitivity parameters that are difficult to interpret, or alternatively, by assuming that specific patterns of unmeasured confounding are absent. Instead, we propose a simple Bayesian sensitivity analysis technique that is indexed by four bias parameters. Our method has the unique advantage that it is able to simultaneously assess unmeasured confounding in the mediator-outcome, exposure-outcome and exposure-mediator relationships. It is a natural Bayesian extension of the sensitivity analysis methodologies of VanderWeele, which have been widely used in the epidemiology literature. We present simulation findings, and additionally, we illustrate the method in an epidemiological study of mortality rates in criminal offenders from British Columbia.

Conformational Heterogeneity and FRET Data Interpretation for Dimensions of Unfolded Proteins.

A mathematically and physically valid formulation is required to infer properties of disordered protein conformations from single-molecule Förster resonance energy transfer (smFRET). Conformational dimensions inferred by conventional approaches that presume a homogeneous conformational ensemble can be unphysical. When all possible conformational distributions, heterogeneous as well as homogeneous, are taken into account without prejudgment, a single value of average transfer efficiency 〈E〉 between dyes at two chain ends is generally consistent with highly diverse, multiple values of the average radius of gyration 〈Rg〉. Here we utilize unbiased conformational statistics from a coarse-grained explicit-chain model to establish a general logical framework to quantify this fundamental ambiguity in smFRET inference. As an application, we address the long-standing controversy regarding the denaturant dependence of 〈Rg〉 of unfolded proteins, focusing on Protein L as an example. Conventional smFRET inference concluded that 〈Rg〉 of unfolded Protein L is highly sensitive to [GuHCl], but data from SAXS suggested a near-constant 〈Rg〉 irrespective of [GuHCl]. Strikingly, our analysis indicates that although the reported 〈E〉 values for Protein L at [GuHCl] = 1 and 7 M are very different at 0.75 and 0.45, respectively, the Bayesian Rg² distributions consistent with these two 〈E〉 values overlap by as much as 75%. Our findings suggest, in general, that the smFRET-SAXS discrepancy regarding unfolded protein dimensions likely arises from highly heterogeneous conformational ensembles at low or zero denaturant, and that additional experimental probes are needed to ascertain the nature of this heterogeneity.

Modeling conditional dependence among multiple diagnostic tests.

When multiple imperfect dichotomous diagnostic tests are applied to an individual, it is possible that some or all of their results remain dependent even after conditioning on the true disease status. The estimates could be biased if this conditional dependence is ignored when using the test results to infer about the prevalence of a disease or the accuracies of the diagnostic tests. However, statistical methods correcting for this bias by modelling higher-order conditional dependence terms between multiple diagnostic tests are not well addressed in the literature. This paper extends a Bayesian fixed effects model for 2 diagnostic tests with pairwise correlation to cases with 3 or more diagnostic tests with higher order correlations. Simulation results show that the proposed fixed effects model works well both in the case when the tests are highly correlated and in the case when the tests are truly conditionally independent, provided adequate external information is available in the form of fixed constraints or prior distributions. A data set on the diagnosis of childhood pulmonary tuberculosis is used to illustrate the proposed model.
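The pairwise case that the paper starts from can be sketched in Python with a covariance parameterization, which is one common way to write conditional dependence between two tests and is assumed here rather than taken from the paper. The function name and the numerical values are hypothetical; the sketch covers only diseased individuals and the two-test case, not the higher-order extension.

```python
def joint_cell_probs(se1, se2, cov):
    """Joint result probabilities for two tests among diseased individuals.

    se1, se2: sensitivities of the two tests.
    cov: conditional covariance between the test results given disease
         (cov = 0 recovers conditional independence).
    """
    # The covariance must keep all four cell probabilities non-negative.
    lower = max(-(1 - se1) * (1 - se2), -se1 * se2)
    upper = min(se1, se2) - se1 * se2
    if not lower <= cov <= upper:
        raise ValueError("covariance outside the range allowed by se1 and se2")
    return {
        (1, 1): se1 * se2 + cov,              # both tests positive
        (1, 0): se1 * (1 - se2) - cov,        # only test 1 positive
        (0, 1): (1 - se1) * se2 - cov,        # only test 2 positive
        (0, 0): (1 - se1) * (1 - se2) + cov,  # both tests negative
    }

# Hypothetical sensitivities with a positive conditional covariance.
cells = joint_cell_probs(0.9, 0.8, 0.05)
```

The covariance shifts probability toward concordant results while leaving each test's marginal sensitivity unchanged, which is why ignoring it can bias prevalence and accuracy estimates without being visible in the marginals.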

Molecular evolution and phylogeography of infectious hematopoietic necrosis virus with a focus on its presence in France over the last 30 years.

Infectious hematopoietic necrosis virus (IHNV) is among the most important pathogens affecting the salmonid industry. Here, we investigated the molecular evolution and circulation of isolates from 11 countries or regions all over the world, with a special focus on the epidemiological situation in France. The phylogeography, time to the most recent common ancestor (TMRCA) and nucleotide substitution rate were studied using 118 full-length glycoprotein gene sequences isolated from 9 countries (5 genogroups) over a period of 47 years. The TMRCA dates back to 1943, with the L genogroup identified as the likely root (67 %), which is consistent with the first report of this pathogen in the USA. A Bayesian inference approach was applied to the partial glycoprotein gene sequences of 88 representative strains isolated in France over the period 1987-2015. The genetic diversity of these 88 sequences showed mean nucleotide and amino-acid identities of 97.1 and 97.8 %, respectively, and a dN/dS ratio (non-synonymous to synonymous mutations) of 0.25, indicating purifying selection. The French viral populations are divided into eight sub-clades and four individual isolates, with a clear spatial differentiation, suggesting the predominant role of local reservoirs in contamination. The atypical 'signatures' of some isolates underlined the usefulness of molecular phylogeny for epidemiological investigations that track the spread of IHNV.

BayFish: Bayesian inference of transcription dynamics from population snapshots of single-molecule RNA FISH in single cells.

Single-molecule RNA fluorescence in situ hybridization (smFISH) provides unparalleled resolution in the measurement of the abundance and localization of nascent and mature RNA transcripts in fixed, single cells. We developed a computational pipeline (BayFish) to infer the kinetic parameters of gene expression from smFISH data at multiple time points after gene induction. Given an underlying model of gene expression, BayFish uses a Monte Carlo method to estimate the Bayesian posterior probability of the model parameters and quantify the parameter uncertainty given the observed smFISH data. We tested BayFish on synthetic data and smFISH measurements of the neuronal activity-inducible gene Npas4 in primary neurons.
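The general idea of estimating a posterior by Monte Carlo can be sketched with a random-walk Metropolis sampler in Python. This is not BayFish's model or code: it assumes a much simpler setting in which per-cell transcript counts are Poisson with a single unknown mean, with an exponential prior on that mean, and all names and data values are hypothetical.

```python
import math
import random

def log_posterior(log_rate, counts, prior_rate=0.1):
    """Unnormalized log posterior of the Poisson mean, parameterized by its logarithm."""
    rate = math.exp(log_rate)
    log_lik = sum(counts) * log_rate - len(counts) * rate  # Poisson terms that depend on the rate
    log_prior = -prior_rate * rate + log_rate  # Exponential prior plus Jacobian of the log transform
    return log_lik + log_prior

def metropolis(counts, n_steps=5000, step=0.2, seed=0):
    """Random-walk Metropolis on the log of the Poisson mean; returns samples of the mean."""
    rng = random.Random(seed)
    current = math.log(max(sum(counts) / len(counts), 1e-3))
    current_lp = log_posterior(current, counts)
    samples = []
    for _ in range(n_steps):
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_posterior(proposal, counts)
        # Accept with probability min(1, posterior ratio); the proposal is symmetric.
        if rng.random() < math.exp(min(0.0, proposal_lp - current_lp)):
            current, current_lp = proposal, proposal_lp
        samples.append(math.exp(current))
    return samples

# Hypothetical transcript counts from ten cells at one time point.
counts = [4, 7, 5, 6, 3, 5, 8, 4, 6, 5]
samples = metropolis(counts)
kept = samples[1000:]  # discard burn-in
posterior_mean = sum(kept) / len(kept)
```

The spread of the retained samples is what quantifies parameter uncertainty; for this toy model the exact posterior is a Gamma distribution, so the sampler's output can be checked against it, which is not possible for most realistic gene expression models.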