A site to transform PubMed publications into these bibliographic reference formats: ADS, BibTeX, EndNote, ISI (used by the Web of Knowledge), RIS, MEDLINE, and Microsoft Word 2007 XML.

Algorithms - Top 30 Publications

Beware of the origin of numbers: Standard scoring of the SF-12 and SF-36 summary measures distorts measurement and score interpretations.

The 12-item Short Form Health Survey (SF-12) is a generic health rating scale developed to reproduce the Physical and Mental Component Summary scores (PCS and MCS, respectively) of a longer survey, the SF-36. The standard PCS/MCS scoring algorithm has been criticized because its expected dimensionality often lacks empirical support, scoring is based on the assumption that physical and mental health are uncorrelated, and because scores on physical health items influence MCS scores, and vice versa. In this paper, we review the standard PCS/MCS scoring algorithm for the SF-12 and consider alternative scoring procedures: the RAND-12 Health Status Inventory (HSI) and raw sum scores. We corroborate that the SF-12 reproduces SF-36 scores but also inherits its problems. In simulations, good physical health scores reduce mental health scores, and vice versa. This may explain results of clinical studies in which, for example, poor physical health scores result in good MCS scores despite compromised mental health. When applied to empirical data from people with Parkinson's disease (PD) and stroke, standard SF-12 scores suggest a weak correlation between physical and mental health (rs .16), whereas RAND-12 HSI and raw sum scores show a much stronger correlation (rs .67-.68). Furthermore, standard PCS scores yield a different statistical conclusion regarding the association between physical health and age than do RAND-12 HSI and raw sum scores. We recommend that the standard SF-12 scoring algorithm be abandoned in favor of alternatives that provide more valid representations of physical and mental health, of which raw sum scores appear the simplest.
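The cross-influence described above is what one would expect when a mental summary score is a weighted sum over all items and the physical items carry negative weights. The sketch below illustrates only that mechanism. The weights, the two-scale layout, and the name `summary_score` are hypothetical and chosen for illustration; they are not the published SF-12 or SF-36 coefficients.

```python
# Hypothetical weights for illustration only (NOT the published coefficients).
# The mental summary puts a negative weight on the physical scale, and vice versa.
PCS_WEIGHTS = {"physical": 0.8, "mental": -0.2}
MCS_WEIGHTS = {"physical": -0.2, "mental": 0.8}


def summary_score(z_scores, weights):
    """Summary score: 50 plus 10 times a weighted sum of standardized scale scores."""
    return 50 + 10 * sum(weights[k] * z_scores[k] for k in weights)


# Two respondents with the same below-average mental health
# but opposite physical health.
poor_physical = {"physical": -2.0, "mental": -1.0}
good_physical = {"physical": 2.0, "mental": -1.0}

mcs_poor_physical = summary_score(poor_physical, MCS_WEIGHTS)
mcs_good_physical = summary_score(good_physical, MCS_WEIGHTS)

# The respondent in good physical health receives the lower mental score,
# although the mental scale itself is identical for both.
print(mcs_poor_physical, mcs_good_physical)
```

A raw sum score of the mental items alone would be identical for both respondents, which is the property the abstract argues for.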

The tradition algorithm approach underestimates the prevalence of serodiagnosis of syphilis in HIV-infected individuals.

Currently, there are three algorithms for syphilis screening: the traditional algorithm, the reverse algorithm, and the European Centre for Disease Prevention and Control (ECDC) algorithm. To date, there is no generally recognized diagnostic algorithm. When syphilis meets HIV, the situation is even more complex. To evaluate their screening performance and impact on the seroprevalence of syphilis in HIV-infected individuals, we conducted a cross-sectional study that included 865 serum samples from HIV-infected patients in a tertiary hospital. Every sample (one per patient) was tested with the toluidine red unheated serum test (TRUST), the T. pallidum particle agglutination assay (TPPA), and the Treponema pallidum enzyme immunoassay (TP-EIA) according to the manufacturer's instructions. The results of syphilis serological testing were interpreted following each algorithm in turn. We directly compared the traditional syphilis screening algorithm with the reverse syphilis screening algorithm in this unique population. The reverse algorithm yielded a markedly higher seroprevalence of syphilis than the traditional algorithm (24.9% vs. 14.2%, p < 0.0001). Compared to the reverse algorithm, the traditional algorithm also had a missed serodiagnosis rate of 42.8%. The total percentages of agreement and corresponding kappa values of the traditional and ECDC algorithms, each compared with the reverse algorithm, were 89.4% and 0.668, and 99.8% and 0.994, respectively. There was a very good strength of agreement between the reverse and the ECDC algorithm. Our results support the reverse (or ECDC) algorithm for syphilis screening in HIV-infected populations. In addition, our study demonstrated that screening HIV-infected populations with different algorithms may result in statistically different seroprevalences of syphilis.
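The agreement figures can be reproduced with a short calculation of percent agreement and Cohen's kappa. This is a sketch under stated assumptions: the abstract reports percentages only, so the positive counts are back-calculated from 24.9% and 14.2% of 865 samples, and the sketch further assumes that every sample positive under the traditional algorithm is also positive under the reverse algorithm. The function name `agreement_and_kappa` is illustrative.

```python
def agreement_and_kappa(both_pos, a_only, b_only, both_neg):
    """Percent agreement and Cohen's kappa for two binary classifications."""
    n = both_pos + a_only + b_only + both_neg
    observed = (both_pos + both_neg) / n
    a_pos = (both_pos + a_only) / n
    b_pos = (both_pos + b_only) / n
    # Agreement expected by chance from the marginal positive rates.
    expected = a_pos * b_pos + (1 - a_pos) * (1 - b_pos)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


N = 865
# Assumed counts, back-calculated from the reported seroprevalences.
reverse_pos = round(0.249 * N)       # 215
traditional_pos = round(0.142 * N)   # 123

# Assumption: traditional positives are a subset of reverse positives.
observed, kappa = agreement_and_kappa(
    both_pos=traditional_pos,
    a_only=0,
    b_only=reverse_pos - traditional_pos,
    both_neg=N - reverse_pos,
)
missed_rate = (reverse_pos - traditional_pos) / reverse_pos

print(round(observed * 100, 1), round(kappa, 3), round(missed_rate * 100, 1))
```

Under these assumptions the sketch gives 89.4% agreement, a kappa of 0.668, and a missed serodiagnosis rate of 42.8%, matching the values reported for the traditional algorithm.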

A "Patch" to the NYU Emergency Department Visit Algorithm.

To document erosion in the New York University Emergency Department (ED) visit algorithm's capability to classify ED visits and to provide a "patch" to the algorithm.

The accuracy of HIV rapid testing in integrated bio-behavioral surveys of men who have sex with men across 5 Provinces in South Africa.

We describe the accuracy of serial rapid HIV testing among men who have sex with men (MSM) in South Africa and discuss the implications for HIV testing and prevention. This was a cross-sectional survey conducted at five stand-alone facilities from five provinces. Demographic, behavioral, and clinical data were collected. Dried blood spots were obtained for HIV-related testing. Participants were offered rapid HIV testing using 2 rapid diagnostic tests (RDTs) in series. In the laboratory, reference HIV testing was conducted using a third-generation enzyme immunoassay (EIA) and a fourth-generation EIA as confirmatory. Accuracy, sensitivity, specificity, positive predictive value, negative predictive value, false-positive, and false-negative rates were determined. Between August 2015 and July 2016, 2503 participants were enrolled. Of these, 2343 were tested by RDT on site with a further 2137 (91.2%) having definitive results on both RDT and EIA. Sensitivity, specificity, positive predictive value, negative predictive value, false-positive rates, and false-negative rates were 92.6% [95% confidence interval (95% CI) 89.6-94.8], 99.4% (95% CI 98.9-99.7), 97.4% (95% CI 95.2-98.6), 98.3% (95% CI 97.6-98.8), 0.6% (95% CI 0.3-1.1), and 7.4% (95% CI 5.2-10.4), respectively. False negatives were similar to true positives with respect to virological profiles. Overall accuracy of the RDT algorithm was high, but sensitivity was lower than expected. Post-HIV test counseling should include discussions of possible false-negative results and the need for retesting among HIV negatives.
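The six accuracy measures listed above all derive from a single 2x2 table of RDT result against EIA reference. A minimal sketch follows. The abstract does not report the cell counts, so the counts used here are illustrative only: they were chosen to sum to the 2137 participants with definitive results and to reproduce the reported point estimates after rounding, and they are not taken from the paper.

```python
def diagnostic_accuracy(tp, fp, fn, tn):
    """Standard accuracy measures from a 2x2 table (index test vs. reference)."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "false_positive_rate": fp / (fp + tn),
        "false_negative_rate": fn / (fn + tp),
    }


# Illustrative counts (assumption, not from the paper); they sum to 2137.
metrics = diagnostic_accuracy(tp=373, fp=10, fn=30, tn=1724)

for name, value in metrics.items():
    print(f"{name}: {value * 100:.1f}%")
```

Note that the false-negative rate is the complement of sensitivity and the false-positive rate is the complement of specificity, which is why 92.6% and 7.4% sum to 100%.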

Artificial intelligence: AI zooms in on highly influential citations.

Prediction of Anti-VEGF Treatment Requirements in Neovascular AMD Using a Machine Learning Approach.

The purpose of this study was to predict low and high anti-VEGF injection requirements during a pro re nata (PRN) treatment, based on sets of optical coherence tomography (OCT) images acquired during the initiation phase in neovascular AMD.

Machine Learning of the Progression of Intermediate Age-Related Macular Degeneration Based on OCT Imaging.

To develop a data-driven interpretable predictive model of incoming drusen regression as a sign of disease activity and identify optical coherence tomography (OCT) biomarkers associated with its risk in intermediate age-related macular degeneration (AMD).

Machine Learning and Prediction in Medicine - Beyond the Peak of Inflated Expectations.

Extraction frequencies.

Diagnostic algorithm for relapsing acquired demyelinating syndromes in children.

To establish whether children with relapsing acquired demyelinating syndromes (RDS) and myelin oligodendrocyte glycoprotein antibodies (MOG-Ab) show distinctive clinical and radiologic features and to generate a diagnostic algorithm for the main RDS for clinical use.

A Fast and Accurate Algorithm to Test for Binary Phenotypes and Its Application to PheWAS.

The availability of electronic health record (EHR)-based phenotypes allows for genome-wide association analyses in thousands of traits and has great potential to enable identification of genetic variants associated with clinical phenotypes. We can interpret the phenome-wide association study (PheWAS) result for a single genetic variant by observing its association across a landscape of phenotypes. Because a PheWAS can test thousands of binary phenotypes, and most of them have unbalanced or often extremely unbalanced case-control ratios (1:10 or 1:600, respectively), existing methods cannot provide an accurate and scalable way to test for associations. Here, we propose a computationally fast score-test-based method that estimates the distribution of the test statistic by using the saddlepoint approximation. Our method is much (∼100 times) faster than the state-of-the-art Firth's test. It can also adjust for covariates and control type I error rates even when the case-control ratio is extremely unbalanced. Through application to PheWAS data from the Michigan Genomics Initiative, we show that the proposed method can control type I error rates while replicating previously known association signals even for traits with a very small number of cases and a large number of controls.

Internal quality control on HER2 status determination in breast cancers: Experience of a cancer center.

The implementation of an internal quality control is mandatory to guarantee the accuracy of HER2 status in invasive breast cancers.

Inference in the Brain: Statistics Flowing in Redundant Population Codes.

It is widely believed that the brain performs approximate probabilistic inference to estimate causal variables in the world from ambiguous sensory data. To understand these computations, we need to analyze how information is represented and transformed by the actions of nonlinear recurrent neural networks. We propose that these probabilistic computations function by a message-passing algorithm operating at the level of redundant neural populations. To explain this framework, we review its underlying concepts, including graphical models, sufficient statistics, and message-passing, and then describe how these concepts could be implemented by recurrently connected probabilistic population codes. The relevant information flow in these networks will be most interpretable at the population level, particularly for redundant neural codes. We therefore outline a general approach to identify the essential features of a neural message-passing algorithm. Finally, we argue that to reveal the most important aspects of these neural computations, we must study large-scale activity patterns during moderately complex, naturalistic behaviors.

Socially Assistive Robots Help Patients Make Behavioral Changes.

A review of active learning approaches to experimental design for uncovering biological networks.

Various types of biological knowledge describe networks of interactions among elementary entities. For example, transcriptional regulatory networks consist of interactions among proteins and genes. Current knowledge about the exact structure of such networks is highly incomplete, and laboratory experiments that manipulate the entities involved are conducted to test hypotheses about these networks. In recent years, various automated approaches to experiment selection have been proposed. Many of these approaches can be characterized as active machine learning algorithms. Active learning is an iterative process in which a model is learned from data, hypotheses are generated from the model to propose informative experiments, and the experiments yield new data that is used to update the model. This review describes the various models, experiment selection strategies, validation techniques, and successful applications described in the literature; highlights common themes and notable distinctions among methods; and identifies likely directions of future research and open problems in the area.

Transition index maps for urban growth simulation: application of artificial neural networks, weight of evidence and fuzzy multi-criteria evaluation.

Transition index maps (TIMs) are key products in urban growth simulation models. However, their operationalization is still conflicting. Our aim was to compare the prediction accuracy of three TIM-based spatially explicit land cover change (LCC) models in the mega city of Mumbai, India. These LCC models include two data-driven approaches, namely artificial neural networks (ANNs) and weight of evidence (WOE), and one knowledge-based approach which integrates an analytical hierarchical process with fuzzy membership functions (FAHP). The performance of these three LCC models was evaluated using the relative operating characteristic (ROC). The results showed 85%, 75%, and 73% accuracy for the ANN, FAHP, and WOE, respectively. The ANN was clearly superior to the other LCC models when simulating urban growth for the year 2010; hence, ANN was used to predict urban growth for 2020 and 2030. Projected urban growth maps were assessed using statistical measures, including figure of merit, average spatial distance deviation, producer accuracy, and overall accuracy. Based on our findings, we recommend ANNs as an accurate method for simulating future patterns of urban growth.
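An ROC-based comparison of this kind can be summarized by the area under the ROC curve, which equals the probability that a randomly chosen changed cell receives a higher transition index than a randomly chosen unchanged cell. The sketch below computes that quantity by direct pairwise comparison. The cell scores and labels are made-up toy values, and `roc_auc` is an illustrative name; the abstract does not describe how the authors computed their ROC statistics.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data (assumption): transition index per cell, and whether the cell
# actually became urban (1) or not (0).
transition_index = [0.9, 0.8, 0.4, 0.3]
became_urban = [1, 0, 1, 0]

auc = roc_auc(transition_index, became_urban)
print(auc)
```

A value of 0.5 corresponds to a map that ranks cells no better than chance, and 1.0 to a map that ranks every changed cell above every unchanged one.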

Artificial Intelligence in Precision Cardiovascular Medicine.

Artificial intelligence (AI) is a field of computer science that aims to mimic human thought processes, learning capacity, and knowledge storage. AI techniques have been applied in cardiovascular medicine to explore novel genotypes and phenotypes in existing diseases, improve the quality of patient care, enable cost-effectiveness, and reduce readmission and mortality rates. Over the past decade, several machine-learning techniques have been used for cardiovascular disease diagnosis and prediction. Each problem requires some degree of understanding of the problem, in terms of cardiovascular medicine and statistics, to apply the optimal machine-learning algorithm. In the near future, AI will result in a paradigm shift toward precision cardiovascular medicine. The potential of AI in cardiovascular medicine is tremendous; however, ignorance of the challenges may overshadow its potential clinical impact. This paper gives a glimpse of AI's application in cardiovascular clinical care and discusses its potential role in facilitating precision cardiovascular medicine.

An algorithm for identifying mothball composition.

Unintentional mothball ingestions may cause serious toxicity in small children. Camphor, naphthalene, and paradichlorobenzene mothballs are difficult to distinguish without packaging. Symptoms and management differ based on the ingested compound. Previous studies have used a variety of antiquated, impractical and potentially dangerous techniques to identify the mothballs. The goal of this study is to discover a simplified identification technique using materials readily available in an emergency department.

Development of MODIS data-based algorithm for retrieving sea surface temperature in coastal waters.

A new algorithm was developed for retrieving sea surface temperature (SST) in coastal waters using satellite remote sensing data from the Moderate Resolution Imaging Spectroradiometer (MODIS) aboard the Aqua platform. The new SST algorithm was trained using the Artificial Neural Network (ANN) method and tested using 8 years of remote sensing data from the MODIS Aqua sensor and in situ sensing data from US coastal waters in Louisiana, Texas, Florida, California, and New Jersey. The ANN algorithm could be used to map SST in both deep offshore and, in particular, shallow nearshore waters at a high spatial resolution of 1 km, greatly expanding the coverage of remote sensing-based SST data from offshore waters to nearshore waters. Applications of the ANN algorithm require only the remotely sensed reflectance values from the two MODIS Aqua thermal bands 31 and 32 as input data. Application results indicated that the ANN algorithm was able to explain 82-90% of the variation in observed SST in US coastal waters. While the algorithm is generally applicable to the retrieval of SST, it works best for nearshore waters where important coastal resources are located and existing algorithms are either not applicable or do not work well, making the new ANN-based SST algorithm unique and particularly useful to coastal resource management.

In silico and cell-based analyses reveal strong divergence between prediction and observation of T-cell-recognized tumor antigen T-cell epitopes.

Tumor exomes provide comprehensive information on mutated, overexpressed genes and aberrant splicing, which can be exploited for personalized cancer immunotherapy. Of particular interest are mutated tumor antigen T-cell epitopes, because neoepitope-specific T cells often are tumoricidal. However, identifying tumor-specific T-cell epitopes is a major challenge. A widely used strategy relies on initial prediction of human leukocyte antigen-binding peptides by in silico algorithms, but the predictive power of this approach is unclear. Here, we used the human tumor antigen NY-ESO-1 (ESO) and the human leukocyte antigen variant HLA-A*0201 (A2) as a model and predicted in silico the 41 highest-affinity, A2-binding 8-11-mer peptides and assessed their binding, kinetic complex stability, and immunogenicity in A2-transgenic mice and on peripheral blood mononuclear cells from ESO-vaccinated melanoma patients. We found that 19 of the peptides strongly bound to A2, 10 of which formed stable A2-peptide complexes and induced CD8(+) T cells in A2-transgenic mice. However, only 5 of the peptides induced cognate T cells in humans; these peptides exhibited strong binding and complex stability and contained multiple large hydrophobic and aromatic amino acids. These results were not predicted by in silico algorithms and provide new clues to improving T-cell epitope identification. In conclusion, our findings indicate that only a small fraction of in silico-predicted A2-binding ESO peptides are immunogenic in humans, namely those that have high peptide-binding strength and complex stability. This observation highlights the need for improving in silico predictions of peptide immunogenicity.

Development and validation of QRISK3 risk prediction algorithms to estimate future risk of cardiovascular disease: prospective cohort study.

Objectives To develop and validate updated QRISK3 prediction algorithms to estimate the 10 year risk of cardiovascular disease in women and men accounting for potential new risk factors. Design Prospective open cohort study. Setting General practices in England providing data for the QResearch database. Participants 1309 QResearch general practices in England: 981 practices were used to develop the scores and a separate set of 328 practices were used to validate the scores. 7.89 million patients aged 25-84 years were in the derivation cohort and 2.67 million patients in the validation cohort. Patients were free of cardiovascular disease and not prescribed statins at baseline. Methods Cox proportional hazards models in the derivation cohort to derive separate risk equations in men and women for evaluation at 10 years. Risk factors considered included those already in QRISK2 (age, ethnicity, deprivation, systolic blood pressure, body mass index, total cholesterol: high density lipoprotein cholesterol ratio, smoking, family history of coronary heart disease in a first degree relative aged less than 60 years, type 1 diabetes, type 2 diabetes, treated hypertension, rheumatoid arthritis, atrial fibrillation, chronic kidney disease (stage 4 or 5)) and new risk factors (chronic kidney disease (stage 3, 4, or 5), a measure of systolic blood pressure variability (standard deviation of repeated measures), migraine, corticosteroids, systemic lupus erythematosus (SLE), atypical antipsychotics, severe mental illness, and HIV/AIDS). We also considered erectile dysfunction diagnosis or treatment in men. Measures of calibration and discrimination were determined in the validation cohort for men and women separately and for individual subgroups by age group, ethnicity, and baseline disease status. Main outcome measures Incident cardiovascular disease recorded on any of the following three linked data sources: general practice, mortality, or hospital admission records. Results 363 565 incident cases of cardiovascular disease were identified in the derivation cohort during follow-up arising from 50.8 million person years of observation. All new risk factors considered met the model inclusion criteria except for HIV/AIDS, which was not statistically significant. The models had good calibration and high levels of explained variation and discrimination. In women, the algorithm explained 59.6% of the variation in time to diagnosis of cardiovascular disease (R², with higher values indicating more variation), and the D statistic was 2.48 and Harrell's C statistic was 0.88 (both measures of discrimination, with higher values indicating better discrimination). The corresponding values for men were 54.8%, 2.26, and 0.86. Overall performance of the updated QRISK3 algorithms was similar to the QRISK2 algorithms. Conclusion Updated QRISK3 risk prediction models were developed and validated. The inclusion of additional clinical variables in QRISK3 (chronic kidney disease, a measure of systolic blood pressure variability (standard deviation of repeated measures), migraine, corticosteroids, SLE, atypical antipsychotics, severe mental illness, and erectile dysfunction) can help enable doctors to identify those at most risk of heart disease and stroke.
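Harrell's C statistic, one of the discrimination measures reported above, is the proportion of comparable patient pairs in which the patient with the earlier event was also assigned the higher predicted risk. The following is a minimal sketch on made-up data, not the QRISK3 evaluation code: the follow-up times, event flags, and risk scores are toy values, and `harrell_c` is an illustrative name.

```python
def harrell_c(times, events, risks):
    """Harrell's concordance index for right-censored data.

    A pair (i, j) is comparable when patient i had an observed event and
    a shorter follow-up time than patient j. It is concordant when patient i
    also has the higher predicted risk; tied risks count one half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # censored patients cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    return concordant / comparable


# Toy data (assumption): years of follow-up, event indicator, predicted risk.
times = [2, 4, 6, 8, 10]
events = [1, 1, 0, 1, 0]
risks = [0.9, 0.6, 0.7, 0.3, 0.1]

c_statistic = harrell_c(times, events, risks)
print(c_statistic)
```

In this toy example, 7 of the 8 comparable pairs are concordant. A value of 0.5 indicates no discrimination and 1.0 indicates perfect ranking, so reported values of 0.88 and 0.86 indicate that the models rank most comparable pairs correctly.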

A fast algorithm for Bayesian multi-locus model in genome-wide association studies.

Genome-wide association studies (GWAS) have identified a large number of single-nucleotide polymorphisms (SNPs) associated with complex traits. A recently developed linear mixed model for estimating heritability by simultaneously fitting all SNPs suggests that common variants can explain a substantial fraction of heritability, which hints at the low power of the single-variant analysis typically used in GWAS. Consequently, many multi-locus shrinkage models have been proposed under a Bayesian framework. However, most use Markov Chain Monte Carlo (MCMC) algorithms, which are time-consuming and challenging to apply to GWAS data. Here, we propose a fast algorithm for the Bayesian adaptive lasso using variational inference (BAL-VI). Extensive simulations and real data analysis indicate that our model outperforms the well-known Bayesian lasso and Bayesian adaptive lasso models in accuracy and speed. BAL-VI can complete a simultaneous analysis of a lung cancer GWAS data set with ~3400 subjects and ~570,000 SNPs in about half a day.

Single Particle Tracking: From Theory to Biophysical Applications.

After three decades of developments, single particle tracking (SPT) has become a powerful tool to interrogate dynamics in a range of materials including live cells and novel catalytic supports because of its ability to reveal dynamics in the structure-function relationships underlying the heterogeneous nature of such systems. In this review, we summarize the algorithms behind, and practical applications of, SPT. We first cover the theoretical background including particle identification, localization, and trajectory reconstruction. General instrumentation and recent developments to achieve two- and three-dimensional subdiffraction localization and SPT are discussed. We then highlight some applications of SPT to study various biological and synthetic materials systems. Finally, we provide our perspective regarding several directions for future advancements in the theory and application of SPT.

Direct Comparison of 2 Rule-Out Strategies for Acute Myocardial Infarction: 2-h Accelerated Diagnostic Protocol vs 2-h Algorithm.

We compared 2 high-sensitivity cardiac troponin (hs-cTn)-based 2-h strategies in patients presenting with suspected acute myocardial infarction (AMI) to the emergency department (ED): the 2-h accelerated diagnostic protocol (2h-ADP) combining hs-cTn, electrocardiogram, and a risk score, and the 2-h algorithm exclusively based on hs-cTn concentrations and their absolute changes.

The mathematical approaches to differential diagnostics of acute pharyngeal diseases.

The objective of the present study was to develop a program for the differential diagnosis of acute pharyngeal diseases based on the 'ENT-Neuro' artificial neural network. The study group was formed by sampling patients with acute pharyngeal diseases from a set of case histories of subjects presenting with acute inflammatory diseases. The data thus obtained were used to develop an expert decision-support system based on the 'ENT-Neuro' artificial neural network, which can diagnose various inflammatory diseases of the pharynx, including paratonsillitis, parapharyngitis, acute tonsillitis, and acute pharyngitis, with a minimal probability of erroneous diagnosis (4%). The proposed expert system makes it possible to choose the optimal treatment strategy for various pharyngeal diseases, taking into account the severity of the specific pathology, and thereby to minimize the risk of related complications.

A framework for evaluating epidemic forecasts.

Over the past few decades, numerous forecasting methods have been proposed in the field of epidemic forecasting. Such methods can be classified into different categories such as deterministic vs. probabilistic, comparative methods vs. generative methods, and so on. In some of the more popular comparative methods, researchers compare observed epidemiological data from the early stages of an outbreak with the output of proposed models to forecast the future trend and prevalence of the pandemic. A significant problem in this area is the lack of standard well-defined evaluation measures to select the best algorithm among different ones, as well as for selecting the best possible configuration for a particular algorithm.

Metal Artifact Reduction on Chest Computed Tomography Examinations: Comparison of the Iterative Metallic Artefact Reduction Algorithm and the Monoenergetic Approach.

The aim of the study was to compare iterative metallic artefact reduction (iMAR) and monochromatic imaging on metal artifact reduction.

Effect of a Low-Rank Denoising Algorithm on Quantitative Magnetic Resonance Imaging-Based Measures of Liver Fat and Iron.

This study aimed to assess the effect of a low-rank denoising algorithm on quantitative magnetic resonance imaging-based measures of liver fat and iron.

Gas chromatography - mass spectrometry data processing made easy.

Evaluation of GC-MS data may be challenging due to the high complexity of the data, including overlapped, embedded, retention time shifted, and low S/N ratio peaks. In this work, we demonstrate a new approach, PARAFAC2 based Deconvolution and Identification System (PARADISe), for processing raw GC-MS data. PARADISe is platform-independent, freely available software incorporating a number of newly developed algorithms in a coherent framework. It offers a solution for analysts dealing with complex chromatographic data. It allows extraction of chemical/metabolite information directly from the raw data. Using PARADISe requires only a few inputs from the analyst to process GC-MS data, and it converts raw netCDF data files into a compiled peak table. Furthermore, the method is generally robust towards minor variations in the input parameters. The method automatically performs peak identification based on deconvoluted mass spectra using the integrated NIST search engine and generates an identification report. In this paper, we compare PARADISe with AMDIS and ChromaTOF in terms of peak quantification and show that PARADISe is more robust to user-defined settings and that these are easier (and much fewer) to set. PARADISe is based on non-proprietary, scientifically evaluated approaches, and we here show that PARADISe can handle more overlapping signals and lower signal-to-noise peaks, and does so in a manner that requires only about an hour's worth of work regardless of the number of samples. We also show that there are no non-detects in PARADISe, meaning that all compounds are detected in all samples.

Development and Evaluation of the National Cancer Institute's Dietary Screener Questionnaire Scoring Algorithms.

Background: Methods for improving the utility of short dietary assessment instruments are needed. Objective: We sought to describe the development of the NHANES Dietary Screener Questionnaire (DSQ) and its scoring algorithms and performance. Methods: The 19-item DSQ assesses intakes of fruits and vegetables, whole grains, added sugars, dairy, fiber, and calcium. Two nonconsecutive 24-h dietary recalls and the DSQ were administered in NHANES 2009-2010 to respondents aged 2-69 y (n = 7588). The DSQ frequency responses, coupled with sex- and age-specific portion size information, were regressed on intake from 24-h recalls by using the National Cancer Institute usual intake method to obtain scoring algorithms to estimate mean and prevalences of reaching 2 a priori threshold levels. The resulting scoring algorithms were applied to the DSQ and compared with intakes estimated with the 24-h recall data only. The stability of the derived scoring algorithms was evaluated in repeated sampling. Finally, scoring algorithms were applied to screener data, and these estimates were compared with those from multiple 24-h recalls in 3 external studies. Results: The DSQ and its scoring algorithms produced estimates of mean intake and prevalence that agreed closely with those from multiple 24-h recalls. The scoring algorithms were stable in repeated sampling. Differences in the means were <2%; differences in prevalence were <16%. In other studies, agreement between screener and 24-h recall estimates in fruit and vegetable intake varied. For example, among men in 2 studies, estimates from the screener were significantly lower than the 24-h recall estimates (3.2 compared with 3.8 and 3.2 compared with 4.1). In the third study, agreement between the screener and 24-h recall estimates was close among both men (3.2 compared with 3.1) and women (2.6 compared with 2.5). Conclusions: This approach to developing scoring algorithms is an advance in the use of screeners. However, because these algorithms may not be generalizable to all studies, a pilot study in the proposed study population is advisable. Although more precise instruments such as 24-h dietary recalls are recommended in most research, the NHANES DSQ provides a less burdensome alternative when time and resources are constrained and interest is in a limited set of dietary factors.