A site to transform PubMed publications into these bibliographic reference formats: ADS, BibTeX, EndNote, ISI (used by the Web of Knowledge), RIS, MEDLINE, and Microsoft Word 2007 XML.

Models, Statistical - Top 30 Publications

Prediction models for endometrial cancer for the general population or symptomatic women: A systematic review.

To provide an overview of prediction models for the risk of developing endometrial cancer in women of the general population or for the presence of endometrial cancer in symptomatic women.

Prognostic and predictive factors in patients with brain metastases from solid tumors: A review of published nomograms.

To review published nomograms that predict endpoints such as overall survival (OS) or risk of intracranial relapse in patients with brain metastases from solid tumors.

Estimating virus occurrence using Bayesian modeling in multiple drinking water systems of the United States.

Drinking water treatment plants rely on purification of contaminated source waters to provide communities with potable water. One group of possible contaminants is enteric viruses. Measurement of viral quantities in environmental water systems is often performed using polymerase chain reaction (PCR) or quantitative PCR (qPCR). However, true values may be underestimated due to challenges involved in a multi-step viral concentration process and due to PCR inhibition. In this study, water samples were concentrated from 25 drinking water treatment plants (DWTPs) across the US to study the occurrence of enteric viruses in source water and removal after treatment. The five different types of viruses studied were adenovirus, norovirus GI, norovirus GII, enterovirus, and polyomavirus. Quantitative PCR was performed on all samples to determine presence or absence of these viruses in each sample. Ten DWTPs showed presence of one or more viruses in source water, with four DWTPs having treated drinking water testing positive. Furthermore, PCR inhibition was assessed for each sample using an exogenous amplification control, which indicated that all of the DWTP samples, including source and treated water samples, had some level of inhibition, confirming that inhibition plays an important role in PCR-based assessments of environmental samples. PCR inhibition measurements, viral recovery, and other assessments were incorporated into a Bayesian model to more accurately determine viral load in both source and treated water. Results of the Bayesian model indicated that viruses are present in source water and treated water. By using a Bayesian framework that incorporates inhibition, as well as many other parameters that affect viral detection, this study offers an approach for more accurately estimating the occurrence of viral pathogens in environmental waters.

Factors associated with preventable infant death: a multiple logistic regression.

OBJECTIVE To identify and analyze factors associated with preventable child deaths. METHODS This analytical cross-sectional study had preventable child mortality as the dependent variable. From a population of 34,284 live births, we selected a systematic sample of 4,402 children who did not die, compared with 272 children who died from preventable causes during the period studied. The independent variables were analyzed in four hierarchical blocks: sociodemographic factors, the characteristics of the mother, prenatal and delivery care, and health conditions of the patient and neonatal care. We performed a descriptive statistical analysis and estimated multiple hierarchical logistic regression models. RESULTS Approximately 35.3% of the deaths could have been prevented with the early diagnosis and treatment of diseases during pregnancy and 26.8% of them could have been prevented with better care conditions for pregnant women. CONCLUSIONS The following characteristics of the mother are determinants of higher mortality of children before the first year of life: living in neighborhoods with an average family income lower than four minimum wages, being aged ≤ 19 years, having one or more living children, having a child with a low Apgar score at the fifth minute of life, and having a child with low birth weight.

Selecting Health States for EQ-5D-3L Valuation Studies: Statistical Considerations Matter.

For many countries, the three-level EuroQol five-dimensional questionnaire (EQ-5D-3L) value sets have been established to estimate health state utilities. To generate these value sets, researchers first collect values for a subset of preselected health states from a panel representing the general public, and then use a prediction algorithm to generate values for all 243 states. High prevalence of a health state in daily practice has historically been a key criterion in selecting a subset of health states as the observed set. More recently, other criteria have been suggested, especially approaches based on statistical criteria such as randomization and orthogonality.
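The figure of 243 states follows from five dimensions with three levels each (3^5 = 243). The following is a minimal sketch of enumerating that state space and drawing a random subset as the observed set. The dimension names are the standard EQ-5D ones; the subset size of 17 and the helper names `all_states` and `random_subset` are assumptions chosen for illustration, not taken from the abstract.

```python
from itertools import product
import random

# Standard EQ-5D dimensions; each is rated at level 1, 2 or 3 in the 3L version.
DIMENSIONS = (
    "mobility",
    "self_care",
    "usual_activities",
    "pain_discomfort",
    "anxiety_depression",
)
LEVELS = (1, 2, 3)


def all_states():
    """Return every EQ-5D-3L health state as a 5-digit string such as '11223'."""
    return [
        "".join(str(level) for level in combo)
        for combo in product(LEVELS, repeat=len(DIMENSIONS))
    ]


def random_subset(states, k, seed=0):
    """Draw k states uniformly at random (hypothetical selection rule)."""
    rng = random.Random(seed)
    return sorted(rng.sample(states, k))


if __name__ == "__main__":
    states = all_states()
    print(len(states))  # 243
    print(random_subset(states, 17))
```

Simple random sampling is only one of the statistical criteria the abstract mentions; an orthogonal design would need a different selection routine.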

A Framework for Measuring Low-Value Care.

It has been estimated that more than 30% of health care spending in the United States is wasteful, and that low-value care, which drives up costs unnecessarily while increasing patient risk, is a significant component of wasteful spending.

Statistical tools application on dextranase production from Pochonia chlamydosporia (VC4) and its application on dextran removal from sugarcane juice.

The aim of this study was to optimize dextranase production by the fungus Pochonia chlamydosporia (VC4) and to evaluate its activity in reducing dextran in sugarcane juice. The effects of different culture medium components on P. chlamydosporia dextranase production were analyzed using a Plackett-Burman design and a central composite design. Response surface methodology was used to determine, for the variables that influence dextranase production, the levels that give the highest enzyme production. The enzymatic effect on the removal of dextran present in sugarcane juice was also evaluated. Only NaNO3 and pH showed a significant effect (p<0.05) on dextranase production, and the levels that provided the highest enzyme production were 5 g/L and 5.5, respectively. The dextranases produced by P. chlamydosporia reduced the dextran content of the sugarcane juice by 75% after 12 hours of treatment, compared to the control treatment.

Estimating Stand Height and Tree Density in Pinus taeda plantations using in-situ data, airborne LiDAR and k-Nearest Neighbor Imputation.

Accurate forest inventory is of great economic importance for optimizing the entire supply chain management in pulp and paper companies. The aim of this study was to estimate stand dominant and mean heights (HD and HM) and tree density (TD) of Pinus taeda plantations located in South Brazil using in-situ measurements, airborne Light Detection and Ranging (LiDAR) data and non-parametric k-nearest neighbor (k-NN) imputation. Forest inventory attributes and LiDAR-derived metrics were calculated at 53 regular sample plots, and we used imputation models to retrieve the forest attributes at plot and landscape levels. The best LiDAR-derived metrics to predict HD, HM and TD were H99TH, HSD, SKE and HMIN. The imputation model using the selected metrics was more effective for retrieving height than tree density. The model coefficients of determination (adj.R2) and root mean squared difference (RMSD) for HD, HM and TD were 0.90, 0.94, 0.38m and 6.99, 5.70, 12.92%, respectively. Our results show that LiDAR and k-NN imputation can be used to predict stand heights with high accuracy in Pinus taeda. However, further studies are needed to improve the prediction accuracy for TD and to evaluate and compare the cost of acquiring and processing LiDAR data against conventional inventory procedures.
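As a rough illustration of the approach described above, k-NN imputation assigns a target unit the average attribute of the k reference plots that are closest in LiDAR-metric space, and RMSD as a percentage expresses the prediction error relative to the observed mean. This is a sketch under stated assumptions: the Euclidean distance, the unweighted average, k=3 and the function names are illustrative choices, not details given in the abstract.

```python
import math


def knn_impute(ref_metrics, ref_values, target_metrics, k=3):
    """Impute an attribute for one target unit from its k nearest reference plots.

    ref_metrics: list of tuples of LiDAR metrics, one tuple per reference plot.
    ref_values: measured attribute (e.g. height) for each reference plot.
    target_metrics: tuple of the same metrics for the unit to impute.
    Assumes Euclidean distance on metrics that are already on comparable scales.
    """
    ranked = sorted(
        (math.dist(metrics, target_metrics), value)
        for metrics, value in zip(ref_metrics, ref_values)
    )
    nearest = ranked[:k]
    return sum(value for _, value in nearest) / len(nearest)


def rmsd_percent(observed, predicted):
    """Root mean squared difference as a percentage of the observed mean."""
    n = len(observed)
    rmsd = math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)
    return 100.0 * rmsd / (sum(observed) / n)


if __name__ == "__main__":
    # Made-up toy values, not data from the study.
    plots = [(0.0, 0.0), (1.0, 0.0), (10.0, 10.0)]
    heights = [10.0, 12.0, 30.0]
    print(knn_impute(plots, heights, (0.4, 0.0), k=2))
```

In practice the metrics would be standardized before computing distances, since a metric with a large numeric range would otherwise dominate the neighbor search.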

Cox-nnet: An artificial neural network method for prognosis prediction of high-throughput omics data.

Artificial neural networks (ANN) are computing architectures with many interconnections of simple neural-inspired computing elements, and have been applied to biomedical fields such as imaging analysis and diagnosis. We have developed a new ANN framework called Cox-nnet to predict patient prognosis from high throughput transcriptomics data. In 10 TCGA RNA-Seq data sets, Cox-nnet achieves the same or better predictive accuracy compared to other methods, including Cox proportional hazards regression (with LASSO, ridge, and minimax concave penalty), Random Forests Survival and CoxBoost. Cox-nnet also reveals richer biological information, at both the pathway and gene levels. The outputs from the hidden layer node provide an alternative approach for survival-sensitive dimension reduction. In summary, we have developed a new method for accurate and efficient prognosis prediction on high throughput data, with functional biological insights. The source code is freely available at
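The abstract does not spell out the training objective of Cox-nnet, so the sketch below only illustrates the partial likelihood that underlies Cox proportional hazards regression, one of the comparison methods named above. It assumes no special handling of tied event times, and the function name and argument layout are hypothetical.

```python
import math


def neg_log_partial_likelihood(risk_scores, times, events):
    """Negative Cox log partial likelihood for given per-patient risk scores.

    risk_scores: log-risk for each patient (e.g. a linear predictor).
    times: observed follow-up time for each patient.
    events: 1 if the event was observed at that time, 0 if censored.
    Ties are not treated specially (an assumption of this sketch).
    """
    total = 0.0
    for i, (t_i, event_i) in enumerate(zip(times, events)):
        if not event_i:
            continue  # censored patients appear only inside risk sets
        # Risk set: everyone still under observation at time t_i.
        risk_set = [risk_scores[j] for j, t_j in enumerate(times) if t_j >= t_i]
        log_denominator = math.log(sum(math.exp(r) for r in risk_set))
        total += risk_scores[i] - log_denominator
    return -total


if __name__ == "__main__":
    # Three patients with equal risk scores and all events observed.
    print(neg_log_partial_likelihood([0.0, 0.0, 0.0], [1, 2, 3], [1, 1, 1]))
```

Penalized variants such as LASSO or ridge add a penalty term on the model coefficients to this quantity before minimizing it.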

The Proposal to Lower P Value Thresholds to .005.

Hazards of Hazard Ratios - Deviations from Model Assumptions in Immunotherapy.

How one might miss early warning signals of critical transitions in time series data: A systematic study of two major currency pairs.

There is growing interest in the use of critical slowing down and critical fluctuations as early warning signals for critical transitions in different complex systems. However, while some studies found them effective, others found the opposite. In this paper, we investigated why this might be so, by testing three commonly used indicators: lag-1 autocorrelation, variance, and low-frequency power spectrum at anticipating critical transitions in the very-high-frequency time series data of the Australian Dollar-Japanese Yen and Swiss Franc-Japanese Yen exchange rates. Besides testing rising trends in these indicators at a strict level of confidence using the Kendall-tau test, we also required statistically significant early warning signals to be concurrent in the three indicators, which must rise to appreciable values. We then found for our data set the optimum parameters for discovering critical transitions, and showed that the set of critical transitions found is generally insensitive to variations in the parameters. Suspecting that negative results in the literature are the results of low data frequencies, we created time series with time intervals over three orders of magnitude from the raw data, and tested them for early warning signals. Early warning signals can be reliably found only if the time interval of the data is shorter than the time scale of critical transitions in our complex system of interest. Finally, we compared the set of time windows with statistically significant early warning signals with the set of time windows followed by large movements, to conclude that the early warning signals indeed provide reliable information on impending critical transitions. This reliability becomes more compelling statistically the more events we test.
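Two of the three indicators named above, lag-1 autocorrelation and variance, can be computed in a sliding window and then checked for a rising trend with Kendall's tau. The following is a minimal sketch under stated assumptions: the window length, the tau-a form without tie correction, and all function names are illustrative, and the low-frequency power spectrum and the significance test are omitted.

```python
from statistics import mean, pvariance


def lag1_autocorr(x):
    """Lag-1 autocorrelation of a sequence (0.0 if the sequence is constant)."""
    m = mean(x)
    denominator = sum((v - m) ** 2 for v in x)
    if denominator == 0:
        return 0.0
    numerator = sum((x[i] - m) * (x[i + 1] - m) for i in range(len(x) - 1))
    return numerator / denominator


def rolling(series, window, fn):
    """Apply fn to every consecutive window of the given length."""
    return [fn(series[i:i + window]) for i in range(len(series) - window + 1)]


def kendall_tau(y):
    """Kendall tau-a of y against its time index (no tie correction)."""
    n = len(y)
    if n < 2:
        raise ValueError("need at least two points")
    score = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = y[j] - y[i]
            score += (diff > 0) - (diff < 0)  # +1 concordant, -1 discordant
    return score / (n * (n - 1) / 2)


def indicator_trends(series, window):
    """Return (tau of rolling autocorrelation, tau of rolling variance)."""
    ac = rolling(series, window, lag1_autocorr)
    var = rolling(series, window, pvariance)
    return kendall_tau(ac), kendall_tau(var)
```

A tau close to +1 for both indicators at once would correspond to the concurrent rising trends the abstract requires; sampling the same series at coarser intervals before calling `indicator_trends` is one way to probe the data-frequency effect it describes.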

Application of an artificial neural network model for diagnosing type 2 diabetes mellitus and determining the relative importance of risk factors.

To identify the most important demographic risk factors for a diagnosis of type 2 diabetes mellitus (T2DM) using a neural network model.

Risk factor analysis of the patients with solitary pulmonary nodules and establishment of a prediction model for the probability of malignancy.