{"id":641,"date":"2019-11-09T10:33:17","date_gmt":"2019-11-09T10:33:17","guid":{"rendered":"http:\/\/guires.uk\/newsroom\/?p=641"},"modified":"2019-11-11T05:19:03","modified_gmt":"2019-11-11T05:19:03","slug":"predicting-liver-disease-predictive-modeling-using-training-dataset","status":"publish","type":"post","link":"https:\/\/guires.uk\/newsroom\/use-cases\/predicting-liver-disease-predictive-modeling-using-training-dataset\/","title":{"rendered":"Predicting Liver Disease \u2013 Predictive Modeling using Training dataset."},"content":{"rendered":"\n<h3 class=\"h3color\">The Challenges<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">The number of patients with liver disease has been rising steadily because of excessive alcohol consumption, inhalation of harmful gases, and intake of contaminated food, pickles and drugs. There are many kinds of liver disease: diseases caused by viruses, such as&nbsp;<a href=\"https:\/\/medlineplus.gov\/hepatitisa.html\">hepatitis A<\/a>,&nbsp;<a href=\"https:\/\/medlineplus.gov\/hepatitisb.html\">hepatitis B<\/a>, and&nbsp;<a href=\"https:\/\/medlineplus.gov\/hepatitisc.html\">hepatitis C<\/a>; diseases caused by drugs, poisons, or too much alcohol, such as&nbsp;<a href=\"https:\/\/medlineplus.gov\/fattyliverdisease.html\">fatty liver disease<\/a>&nbsp;and&nbsp;<a href=\"https:\/\/medlineplus.gov\/cirrhosis.html\">cirrhosis<\/a>; <a href=\"https:\/\/medlineplus.gov\/livercancer.html\">liver cancer<\/a>; and inherited diseases, such as&nbsp;<a href=\"https:\/\/medlineplus.gov\/hemochromatosis.html\">hemochromatosis<\/a>&nbsp;and&nbsp;<a href=\"https:\/\/medlineplus.gov\/wilsondisease.html\">Wilson disease<\/a>. Obesity is also associated with liver damage. Over time, damage to the liver results in scarring (cirrhosis), which can lead to liver failure, a life-threatening condition. But how do we identify these patients? 
We can use predictive modelling from data science to help prioritize patients.<\/p>\n\n\n\n<h3 class=\"h3color\">Opportunity<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">This dataset was used to evaluate prediction algorithms in an effort to reduce the burden on doctors. <strong>The objective is<\/strong> to predict whether or not a patient is suffering from liver disease. The data we used come from the UCI Machine Learning Repository. The data set contains 416 liver patient records and 167 non-liver patient records collected from North East of Andhra Pradesh, India. The &#8220;Dataset&#8221; column is a class label used to divide the records into two groups: liver patient (liver disease) or not (no disease). The data set contains 441 male patient records and 142 female patient records.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Columns:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>Age of the patient<\/li><li>Gender of the patient<\/li><li>Total Bilirubin<\/li><li>Direct Bilirubin<\/li><li>Alkaline Phosphatase<\/li><li>Alanine Aminotransferase<\/li><li>Aspartate Aminotransferase<\/li><li>Total Proteins<\/li><li>Albumin<\/li><li>Albumin and Globulin Ratio<\/li><li>Dataset: field used to split the data into two sets (patient with liver disease, or no disease)<\/li><\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">The first five rows of the data look like this \u2013<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter\"><img loading=\"lazy\" decoding=\"async\" width=\"677\" height=\"112\" src=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-69.png\" alt=\"\" class=\"wp-image-642\" srcset=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-69.png 677w, https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-69-300x50.png 300w\" sizes=\"auto, (max-width: 677px) 100vw, 677px\" \/><\/figure><\/div>\n\n\n\n<p class=\"wp-block-paragraph\">The gender distribution of this dataset is \u2013<\/p>\n\n\n\n<div 
class=\"wp-block-image\"><figure class=\"aligncenter\"><img loading=\"lazy\" decoding=\"async\" width=\"523\" height=\"355\" src=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-70.png\" alt=\"\" class=\"wp-image-643\" srcset=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-70.png 523w, https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-70-300x204.png 300w\" sizes=\"auto, (max-width: 523px) 100vw, 523px\" \/><\/figure><\/div>\n\n\n\n<p class=\"wp-block-paragraph\">And our target class distribution is \u2013<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter\"><img loading=\"lazy\" decoding=\"async\" width=\"523\" height=\"355\" src=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-71.png\" alt=\"\" class=\"wp-image-644\" srcset=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-71.png 523w, https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-71-300x204.png 300w\" sizes=\"auto, (max-width: 523px) 100vw, 523px\" \/><\/figure><\/div>\n\n\n\n<p class=\"wp-block-paragraph\">Let\u2019s now look more closely at the other features in the dataset.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Alkaline phosphatase: most values lie in the 0-500 range.<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter\"><img loading=\"lazy\" decoding=\"async\" width=\"356\" height=\"245\" src=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-72.png\" alt=\"\" class=\"wp-image-645\" srcset=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-72.png 356w, https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-72-300x206.png 300w\" sizes=\"auto, (max-width: 356px) 100vw, 356px\" \/><\/figure><\/div>\n\n\n\n<p class=\"wp-block-paragraph\">Direct_Bilirubin \u2013 most values of this feature are small, clustered around 0, 1, 2 and 3.<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter\"><img loading=\"lazy\" 
decoding=\"async\" width=\"385\" height=\"274\" src=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-73.png\" alt=\"\" class=\"wp-image-646\" srcset=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-73.png 385w, https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-73-300x214.png 300w\" sizes=\"auto, (max-width: 385px) 100vw, 385px\" \/><\/figure><\/div>\n\n\n\n<p class=\"wp-block-paragraph\">Next, we run a multivariate analysis on Total_Bilirubin and Direct_Bilirubin \u2013 these features are plotted with a seaborn joint plot. We can see that they have almost the same distribution.<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter\"><img loading=\"lazy\" decoding=\"async\" width=\"391\" height=\"394\" src=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-74.png\" alt=\"\" class=\"wp-image-647\" srcset=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-74.png 391w, https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-74-150x150.png 150w, https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-74-298x300.png 298w\" sizes=\"auto, (max-width: 391px) 100vw, 391px\" \/><\/figure><\/div>\n\n\n\n<p class=\"wp-block-paragraph\">Multivariate analysis on Alkaline_Phosphotase and Alamine_Aminotransferase:<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter\"><img loading=\"lazy\" decoding=\"async\" width=\"385\" height=\"376\" src=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-75.png\" alt=\"\" class=\"wp-image-648\" srcset=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-75.png 385w, https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-75-300x293.png 300w\" sizes=\"auto, (max-width: 385px) 100vw, 385px\" \/><\/figure><\/div>\n\n\n\n<p class=\"wp-block-paragraph\">These features are not normally distributed: values are spread widely across 
alkaline phosphatase, while alanine aminotransferase is mostly within 0-250, with a few outliers.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Now we will look at protein and albumin against our output class.<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter\"><img loading=\"lazy\" decoding=\"async\" width=\"488\" height=\"467\" src=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-76.png\" alt=\"\" class=\"wp-image-649\" srcset=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-76.png 488w, https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-76-300x287.png 300w\" sizes=\"auto, (max-width: 488px) 100vw, 488px\" \/><\/figure><\/div>\n\n\n\n<p class=\"wp-block-paragraph\">This is done using the factor plot from the seaborn package in Python. We can see that, in terms of albumin and total proteins, male patients show liver disease more than female patients.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Total_Protiens and Albumin: total protein mostly ranges from 5 to 9 and albumin mostly from 2 to 5, and the two are well correlated.<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter\"><img loading=\"lazy\" decoding=\"async\" width=\"364\" height=\"372\" src=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-77.png\" alt=\"\" class=\"wp-image-650\" srcset=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-77.png 364w, https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-77-294x300.png 294w\" sizes=\"auto, (max-width: 364px) 100vw, 364px\" \/><\/figure><\/div>\n\n\n\n<p class=\"wp-block-paragraph\">Null elements:<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">There are only 4 null elements, all in a single feature.<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter\"><img loading=\"lazy\" decoding=\"async\" width=\"264\" height=\"254\" src=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-78.png\" alt=\"\" 
class=\"wp-image-651\"\/><\/figure><\/div>\n\n\n\n<p class=\"wp-block-paragraph\">Let\u2019s look at the correlation between all the features to understand how closely they are related \u2013<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter\"><img loading=\"lazy\" decoding=\"async\" width=\"638\" height=\"601\" src=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-79.png\" alt=\"\" class=\"wp-image-653\" srcset=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-79.png 638w, https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-79-300x283.png 300w\" sizes=\"auto, (max-width: 638px) 100vw, 638px\" \/><\/figure><\/div>\n\n\n\n<p class=\"wp-block-paragraph\">Removing null elements from the dataset:<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"677\" height=\"113\" src=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-80.png\" alt=\"\" class=\"wp-image-654\" srcset=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-80.png 677w, https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-80-300x50.png 300w\" sizes=\"auto, (max-width: 677px) 100vw, 677px\" \/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Standardisation<\/strong>: this is an important step before modelling because the features need to be on a comparable scale. For example, if an age of 50 and an insulin value of 0.5 are fed into an ML algorithm, the algorithm may treat age as more important simply because its value is larger. For this reason we need to normalise the dataset, leaving out the target column. The dataset after standardisation is shown below.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">We do this before visualisation and modelling so that the features are normalised; this is achieved using the standard scaler (StandardScaler). 
Below are the first few rows of the normalised features \u2013<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter\"><img loading=\"lazy\" decoding=\"async\" width=\"623\" height=\"174\" src=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-81.png\" alt=\"\" class=\"wp-image-655\" srcset=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-81.png 623w, https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-81-300x84.png 300w\" sizes=\"auto, (max-width: 623px) 100vw, 623px\" \/><\/figure><\/div>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Predictive modelling:<\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">We will be using different machine learning algorithms such as SVM (Support Vector Machine), Logistic Regression, Random Forest classifier, Decision tree, KNN (K-Nearest Neighbours) and MLP (Multilayer Perceptron).<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>SVM: <\/strong>we will use the support vector machine from the sklearn Python package. 
We will test two SVM kernels, linear and RBF.<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter\"><img loading=\"lazy\" decoding=\"async\" width=\"680\" height=\"299\" src=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-82.png\" alt=\"\" class=\"wp-image-656\" srcset=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-82.png 680w, https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-82-300x132.png 300w\" sizes=\"auto, (max-width: 680px) 100vw, 680px\" \/><\/figure><\/div>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Logistic Regression: <\/strong>this algorithm is also from the sklearn Python package; here the main hyperparameters are C and the penalty (L1 or L2).<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"714\" height=\"196\" src=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-83.png\" alt=\"\" class=\"wp-image-657\" srcset=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-83.png 714w, https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-83-300x82.png 300w\" sizes=\"auto, (max-width: 714px) 100vw, 714px\" \/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Random Forest<\/strong>: this algorithm is a type of ensemble model which works well for classification problems. 
Let us first find the important features and then predict liver disease.<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter\"><img loading=\"lazy\" decoding=\"async\" width=\"624\" height=\"77\" src=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-84.png\" alt=\"\" class=\"wp-image-658\" srcset=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-84.png 624w, https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-84-300x37.png 300w\" sizes=\"auto, (max-width: 624px) 100vw, 624px\" \/><\/figure><\/div>\n\n\n\n<p class=\"wp-block-paragraph\">The important features according to the random forest are:<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter\"><img loading=\"lazy\" decoding=\"async\" width=\"324\" height=\"228\" src=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-85.png\" alt=\"\" class=\"wp-image-659\" srcset=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-85.png 324w, https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-85-300x211.png 300w, https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-85-320x224.png 320w\" sizes=\"auto, (max-width: 324px) 100vw, 324px\" \/><\/figure><\/div>\n\n\n\n<p class=\"wp-block-paragraph\">Let us first see the accuracy obtained with these features:<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter\"><img loading=\"lazy\" decoding=\"async\" width=\"621\" height=\"152\" src=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-86.png\" alt=\"\" class=\"wp-image-660\" srcset=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-86.png 621w, https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-86-300x73.png 300w\" sizes=\"auto, (max-width: 621px) 100vw, 621px\" \/><\/figure><\/div>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>KNN: <\/strong>K-nearest neighbours is a machine learning algorithm which uses 
distance metrics to find the closest neighbours of each sample; we need to find the value of k that gives the best accuracy.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">We will test many values of k so that we can find the one that gives the best accuracy.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">For this dataset, let\u2019s try values of k up to 20.<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter\"><img loading=\"lazy\" decoding=\"async\" width=\"692\" height=\"347\" src=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-87.png\" alt=\"\" class=\"wp-image-661\" srcset=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-87.png 692w, https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-87-300x150.png 300w\" sizes=\"auto, (max-width: 692px) 100vw, 692px\" \/><\/figure><\/div>\n\n\n\n<p class=\"wp-block-paragraph\">Accuracy does not seem to get higher than 66%, so we stop at 20 nearest neighbours; k = 20 gives 65%.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>AdaBoost classifier<\/strong>: let us now try an ensemble model that takes a decision tree classifier as its base algorithm, with the same hyperparameters as the decision tree.<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter\"><img loading=\"lazy\" decoding=\"async\" width=\"687\" height=\"227\" src=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-88.png\" alt=\"\" class=\"wp-image-662\" srcset=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-88.png 687w, https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-88-300x99.png 300w\" sizes=\"auto, (max-width: 687px) 100vw, 687px\" \/><\/figure><\/div>\n\n\n\n<p class=\"wp-block-paragraph\">This model does not achieve much better accuracy than the others.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Next, we will try <strong>MULTI-LAYER 
PERCEPTRON<\/strong> with different numbers of iterations, different layers and different learning rates.<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter\"><img loading=\"lazy\" decoding=\"async\" width=\"685\" height=\"256\" src=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-89.png\" alt=\"\" class=\"wp-image-663\" srcset=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-89.png 685w, https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-89-300x112.png 300w\" sizes=\"auto, (max-width: 685px) 100vw, 685px\" \/><\/figure><\/div>\n\n\n\n<p class=\"wp-block-paragraph\">As most of the algorithms are not giving good accuracy, let\u2019s try cross-validation with all the algorithms, and we will also try the grid search evaluation method.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Cross-validation:<\/strong> Cross-validation is a resampling procedure used to evaluate machine learning models on a limited data sample.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The general procedure is as follows:<\/p>\n\n\n\n<ol class=\"wp-block-list\"><li>Shuffle the dataset randomly.<\/li><li>Split the dataset into k groups.<\/li><li>For each unique group:<ol><li>Take the group as a holdout or test data set.<\/li><li>Take the remaining groups as a training data set.<\/li><li>Fit a model on the training set and evaluate it on the test set.<\/li><li>Retain the evaluation score and discard the model.<\/li><\/ol><\/li><li>Summarize the skill of the model using the sample of model evaluation scores.<\/li><\/ol>\n\n\n\n<p class=\"wp-block-paragraph\">Now we will use cross-validation with our algorithms and check which gives the best accuracy \u2013 let k be 10<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter\"><img loading=\"lazy\" decoding=\"async\" width=\"237\" height=\"227\" 
src=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-90.png\" alt=\"\" class=\"wp-image-664\"\/><\/figure><\/div>\n\n\n\n<p class=\"wp-block-paragraph\">We can see that linear SVM performed better than the other algorithms; let us also try grid search CV.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">These are the parameters we will use for the random forest classifier.<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter\"><img loading=\"lazy\" decoding=\"async\" width=\"540\" height=\"147\" src=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-91.png\" alt=\"\" class=\"wp-image-665\" srcset=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-91.png 540w, https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-91-300x82.png 300w\" sizes=\"auto, (max-width: 540px) 100vw, 540px\" \/><\/figure><\/div>\n\n\n\n<p class=\"wp-block-paragraph\">With random forest, the best cross-validation accuracy we get is ~73%.<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter\"><img loading=\"lazy\" decoding=\"async\" width=\"655\" height=\"135\" src=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-92.png\" alt=\"\" class=\"wp-image-666\" srcset=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-92.png 655w, https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-92-300x62.png 300w\" sizes=\"auto, (max-width: 655px) 100vw, 655px\" \/><\/figure><\/div>\n\n\n\n<p class=\"wp-block-paragraph\">Classification report:<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter\"><img loading=\"lazy\" decoding=\"async\" width=\"355\" height=\"145\" src=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-93.png\" alt=\"\" class=\"wp-image-667\" srcset=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-93.png 355w, 
https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-93-300x123.png 300w\" sizes=\"auto, (max-width: 355px) 100vw, 355px\" \/><\/figure><\/div>\n\n\n\n<p class=\"wp-block-paragraph\">Confusion matrix:<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter\"><img loading=\"lazy\" decoding=\"async\" width=\"488\" height=\"371\" src=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-94.png\" alt=\"\" class=\"wp-image-668\" srcset=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-94.png 488w, https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-94-300x228.png 300w\" sizes=\"auto, (max-width: 488px) 100vw, 488px\" \/><\/figure><\/div>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>AUC \u2013 ROC curve: <\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The AUC\u2013ROC curve is a model selection metric for binary and multi-class classification problems. ROC is a probability curve for different classes. It tells us how good the model is at distinguishing the given classes in terms of the predicted probability. A typical ROC curve has the False Positive Rate (FPR) on the X-axis and the True Positive Rate (TPR) on the Y-axis. The area covered by the curve is the area between the orange line (ROC) and the X-axis; this area is the AUC. The larger the area, the better the machine learning model is at distinguishing the given classes. 
The ideal value for AUC is 1.<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter\"><img loading=\"lazy\" decoding=\"async\" width=\"419\" height=\"304\" src=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-95.png\" alt=\"\" class=\"wp-image-669\" srcset=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-95.png 419w, https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-95-300x218.png 300w\" sizes=\"auto, (max-width: 419px) 100vw, 419px\" \/><\/figure><\/div>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Further Proceedings:<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>The dataset contains details of liver patients, and we could achieve better accuracy with the help of CT images of livers.<ul><li>As this dataset is so important to the science industry, there should be more contributors to it, which would lead to better ML models.<\/li><\/ul><ul><li>With an image dataset of the liver, we could use various CNNs, Inception V3 and other more advanced algorithms to achieve better accuracy.<\/li><\/ul><\/li><\/ul>\n\n\n\n<h3 class=\"h3color\">Why Guires<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">The mission of Guires Data Analytics is to democratize AI for healthcare industries. Our team of data science experts uses the power of AI to solve business and social challenges.&nbsp; We have been a pioneer in the research field for more than fifteen years and offer end-to-end solutions that help firms set the direction for the company and support analytical frameworks for better understanding and strategic decision-making. We provide appropriate solutions using your existing volume of data, available in varying degrees of complexity, that cannot be processed using traditional technologies, processing methods, or any commercial off-the-shelf solutions. 
When you outsource big data to us, we can analyze events that have happened within and outside an organization and correlate them to provide near-accurate insights into what drove the outcome. Our big data analytics solutions are fast and scalable, with flexible processing.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">We use powerful algorithms, business rules, and statistical models.&nbsp; We work with text, image, audio, video and machine data. Our medical experts understand the different layers of data being integrated and what granularity levels of integration can be completed to create the holistic picture. Our team creates the foundational structure for analytics and visualization of the data. Our data analytics team includes members with advanced mathematical degrees and statisticians with multiple specialist degrees who can apply cutting-edge data mining techniques, thereby enabling our clients to gain rich insights into existing customers and unearth high potential prospects.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">How can you make the most of predictive analytics? Let us help you get started.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Get predictive analytics working for you. Contact a Guires expert.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>The Challenges Patients with the Liver disease have been continuously increasing because of excessive consumption of alcohol, inhale of harmful gases, intake of contaminated food, pickles and drugs. There are many kinds of liver diseases: Diseases caused by viruses, such as&nbsp;hepatitis A,&nbsp;hepatitis B, and&nbsp;hepatitis C, Diseases caused by drugs, poisons, or too much alcohol. 
Examples [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":711,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[55],"tags":[],"class_list":["post-641","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-use-cases"],"_links":{"self":[{"href":"https:\/\/guires.uk\/newsroom\/wp-json\/wp\/v2\/posts\/641","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/guires.uk\/newsroom\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/guires.uk\/newsroom\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/guires.uk\/newsroom\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/guires.uk\/newsroom\/wp-json\/wp\/v2\/comments?post=641"}],"version-history":[{"count":3,"href":"https:\/\/guires.uk\/newsroom\/wp-json\/wp\/v2\/posts\/641\/revisions"}],"predecessor-version":[{"id":712,"href":"https:\/\/guires.uk\/newsroom\/wp-json\/wp\/v2\/posts\/641\/revisions\/712"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/guires.uk\/newsroom\/wp-json\/wp\/v2\/media\/711"}],"wp:attachment":[{"href":"https:\/\/guires.uk\/newsroom\/wp-json\/wp\/v2\/media?parent=641"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/guires.uk\/newsroom\/wp-json\/wp\/v2\/categories?post=641"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/guires.uk\/newsroom\/wp-json\/wp\/v2\/tags?post=641"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}