{"id":694,"date":"2019-11-09T11:44:31","date_gmt":"2019-11-09T11:44:31","guid":{"rendered":"http:\/\/guires.uk\/newsroom\/?p=694"},"modified":"2019-11-11T05:18:19","modified_gmt":"2019-11-11T05:18:19","slug":"predicting-pneumonia-predictive-modeling-using-training-dataset","status":"publish","type":"post","link":"https:\/\/guires.uk\/newsroom\/use-cases\/predicting-pneumonia-predictive-modeling-using-training-dataset\/","title":{"rendered":"Predicting Pneumonia \u2013 Predictive Modeling using Training dataset"},"content":{"rendered":"\n<h3 class=\"h3color\">The Challenges<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Pneumonia<\/strong>&nbsp;is an&nbsp;<a href=\"https:\/\/en.wikipedia.org\/wiki\/Inflammation\">inflammatory<\/a>&nbsp;condition of the&nbsp;<a href=\"https:\/\/en.wikipedia.org\/wiki\/Lung\">lung<\/a>&nbsp;affecting primarily the small air sacs known as&nbsp;<a href=\"https:\/\/en.wikipedia.org\/wiki\/Pulmonary_alveolus\">alveoli<\/a>.&nbsp;Typically symptoms include some combination of&nbsp;<a href=\"https:\/\/en.wikipedia.org\/wiki\/Phlegm\">productive<\/a>&nbsp;or dry&nbsp;<a href=\"https:\/\/en.wikipedia.org\/wiki\/Cough\">cough<\/a>,&nbsp;<a href=\"https:\/\/en.wikipedia.org\/wiki\/Chest_pain\">chest pain<\/a>,&nbsp;<a href=\"https:\/\/en.wikipedia.org\/wiki\/Fever\">fever<\/a>, and&nbsp;<a href=\"https:\/\/en.wikipedia.org\/wiki\/Dyspnea\">trouble breathing<\/a>. Severity is variable. 
Pneumonia is usually caused by infection with&nbsp;<a href=\"https:\/\/en.wikipedia.org\/wiki\/Virus\">viruses<\/a>&nbsp;or&nbsp;<a href=\"https:\/\/en.wikipedia.org\/wiki\/Bacteria\">bacteria<\/a>&nbsp;and less commonly by other&nbsp;<a href=\"https:\/\/en.wikipedia.org\/wiki\/Microorganism\">microorganisms<\/a>, certain&nbsp;<a href=\"https:\/\/en.wikipedia.org\/wiki\/Pharmaceutical_drug\">medications<\/a>&nbsp;and conditions such as&nbsp;<a href=\"https:\/\/en.wikipedia.org\/wiki\/Autoimmune_disease\">autoimmune diseases<\/a>.&nbsp;Risk factors include&nbsp;<a href=\"https:\/\/en.wikipedia.org\/wiki\/Cystic_fibrosis\">cystic fibrosis<\/a>,&nbsp;<a href=\"https:\/\/en.wikipedia.org\/wiki\/Chronic_obstructive_pulmonary_disease\">chronic obstructive pulmonary disease<\/a>&nbsp;(COPD),&nbsp;<a href=\"https:\/\/en.wikipedia.org\/wiki\/Asthma\">asthma<\/a>,&nbsp;<a href=\"https:\/\/en.wikipedia.org\/wiki\/Diabetes_mellitus\">diabetes<\/a>,&nbsp;<a href=\"https:\/\/en.wikipedia.org\/wiki\/Heart_failure\">heart failure<\/a>, a history of&nbsp;<a href=\"https:\/\/en.wikipedia.org\/wiki\/Smoking\">smoking<\/a>, a poor ability to cough such as following a&nbsp;<a href=\"https:\/\/en.wikipedia.org\/wiki\/Stroke\">stroke<\/a>, and a&nbsp;<a href=\"https:\/\/en.wikipedia.org\/wiki\/Immunosupressed\">weak immune system<\/a>. Diagnosis is often based on the symptoms and&nbsp;<a href=\"https:\/\/en.wikipedia.org\/wiki\/Physical_examination\">physical examination<\/a>.&nbsp;<a href=\"https:\/\/en.wikipedia.org\/wiki\/Chest_X-ray\">Chest X-ray<\/a>, blood tests, and&nbsp;<a href=\"https:\/\/en.wikipedia.org\/wiki\/Microbial_culture\">culture<\/a>&nbsp;of the&nbsp;<a href=\"https:\/\/en.wikipedia.org\/wiki\/Sputum\">sputum<\/a>&nbsp;may help confirm the diagnosis. The disease may be classified by where it was acquired with community, hospital, or healthcare-associated pneumonia. 
Pneumonia affects approximately 450&nbsp;million people globally (7% of the population) and results in about four million deaths per year.<\/p>\n\n\n\n<h3 class=\"h3color\">Opportunity <\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">To detect the presence of pneumonia in chest X-rays, we applied predictive modelling. Our dataset consists of more than 5000 chest X-ray images: around 1341 normal images and around 3875 pneumonia images. The dataset is therefore imbalanced towards the pneumonia class. A chest X-ray image from our dataset looks like this \u2013 <\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter\"><img loading=\"lazy\" decoding=\"async\" width=\"694\" height=\"284\" src=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-117.png\" alt=\"\" class=\"wp-image-695\" srcset=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-117.png 694w, https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-117-300x123.png 300w\" sizes=\"auto, (max-width: 694px) 100vw, 694px\" \/><\/figure><\/div>\n\n\n\n<p class=\"wp-block-paragraph\">The image above shows a normal chest X-ray and, on the left, a pneumonia chest X-ray.<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter\"><img loading=\"lazy\" decoding=\"async\" width=\"498\" height=\"370\" src=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-118.png\" alt=\"\" class=\"wp-image-696\" srcset=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-118.png 498w, https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-118-300x223.png 300w\" sizes=\"auto, (max-width: 498px) 100vw, 498px\" \/><\/figure><\/div>\n\n\n\n<p class=\"wp-block-paragraph\">As you can see, the data is highly imbalanced. We have almost three times as many pneumonia cases as normal cases. This situation is very common in medical data. 
Such data is usually imbalanced: either there are too many normal cases, or there are too many cases with the disease.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The next step is to separate our dataset into a training set, a test set and a validation set.<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter\"><img loading=\"lazy\" decoding=\"async\" width=\"326\" height=\"55\" src=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-119.png\" alt=\"\" class=\"wp-image-697\" srcset=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-119.png 326w, https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-119-300x51.png 300w, https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-119-320x55.png 320w, https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-119-321x55.png 321w\" sizes=\"auto, (max-width: 326px) 100vw, 326px\" \/><\/figure><\/div>\n\n\n\n<p class=\"wp-block-paragraph\">As in the above image, we have separated 3715 images into the training set, 16 images into the validation set and 624 images into the test set, each with two classes (pneumonia and normal).<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">After this, we rescale our images and resize them to 64*64 pixels.<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter\"><img loading=\"lazy\" decoding=\"async\" width=\"657\" height=\"287\" src=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-120.png\" alt=\"\" class=\"wp-image-698\" srcset=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-120.png 657w, https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-120-300x131.png 300w\" sizes=\"auto, (max-width: 657px) 100vw, 657px\" \/><\/figure><\/div>\n\n\n\n<p class=\"wp-block-paragraph\">We will be using a CNN (Convolutional Neural Network) for this chest X-ray image dataset.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">A Convolutional Neural Network is a special type 
of Artificial Intelligence implementation that uses a mathematical matrix operation called convolution to process data from images.<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>A&nbsp;<strong>convolution<\/strong>&nbsp;does this by combining two matrices (the image and a filter) and yielding a third, smaller matrix.<\/li><li>The network takes an input image and uses a filter&nbsp;<strong>(or kernel)<\/strong>&nbsp;to create a&nbsp;<strong>feature map<\/strong>&nbsp;describing the image.<\/li><li>In the convolution operation, we take a filter (usually a 2&#215;2 or 3&#215;3 matrix) and&nbsp;<strong>slide<\/strong>&nbsp;it over the image matrix. The corresponding numbers in both matrices are multiplied and summed to yield a single number describing that region of the input. This process is repeated across the whole image.<\/li><li>We pass several different filters over the input, and the resulting feature maps are put together as the final output of the convolutional layer.<\/li><li>We then pass the output of this layer through a non-linear activation function. The most commonly used one is ReLU.<\/li><li>The next step further reduces the dimensionality of the data, which lowers the computation power required for training this model. This is achieved by using a&nbsp;<strong>Pooling Layer.<\/strong>&nbsp;The most commonly used one is&nbsp;<strong>max pooling<\/strong>,&nbsp;which takes the maximum value in the window created by a filter. 
This significantly reduces training time while preserving the most significant information.<\/li><\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">Below is an example of a convolutional neural network architecture \u2013<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter\"><img loading=\"lazy\" decoding=\"async\" width=\"673\" height=\"202\" src=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-121.png\" alt=\"\" class=\"wp-image-699\" srcset=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-121.png 673w, https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-121-300x90.png 300w\" sizes=\"auto, (max-width: 673px) 100vw, 673px\" \/><\/figure><\/div>\n\n\n\n<p class=\"wp-block-paragraph\">Moving on to building the convolutional neural network: our model is sequential, and it uses two Conv2D layers, each followed by a max-pooling layer.<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter\"><img loading=\"lazy\" decoding=\"async\" width=\"664\" height=\"242\" src=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-122.png\" alt=\"\" class=\"wp-image-700\" srcset=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-122.png 664w, https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-122-300x109.png 300w\" sizes=\"auto, (max-width: 664px) 100vw, 664px\" \/><\/figure><\/div>\n\n\n\n<p class=\"wp-block-paragraph\">In the first layer, we set the input size to the 64*64 that we resized the images to, and the activation function is the rectified linear unit (ReLU).<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Filter size \u2013 (3,3)<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Number of filters \u2013 32<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Max pooling size \u2013 (2,2)<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">We use the same settings for the following layers, and a sigmoid function is used at the end.<\/p>\n\n\n\n<p 
class=\"wp-block-paragraph\">Optimizer used here \u2013 \u2018RMSprop\u2019<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Loss \u2013 binary_crossentropy<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Metrics \u2013 \u2018Accuracy\u2019<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Activation \u2013 <\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>ReLU for CNN layers<\/li><li>Sigmoid for the final layer (as our model has a binary class output)<\/li><\/ul>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>CNN Summary:<\/strong><\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter\"><img loading=\"lazy\" decoding=\"async\" width=\"649\" height=\"521\" src=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-123.png\" alt=\"\" class=\"wp-image-701\" srcset=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-123.png 649w, https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-123-300x241.png 300w, https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-123-321x257.png 321w\" sizes=\"auto, (max-width: 649px) 100vw, 649px\" \/><\/figure><\/div>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>CNN fit generator:<\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The CNN model is fitted on the training_set for 10 epochs, and the validation generator is used for the validation set. Steps per epoch are 163, and the validation steps are 624. 
The code for the fit generator is given below.<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter\"><img loading=\"lazy\" decoding=\"async\" width=\"649\" height=\"114\" src=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-124.png\" alt=\"\" class=\"wp-image-702\" srcset=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-124.png 649w, https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-124-300x53.png 300w\" sizes=\"auto, (max-width: 649px) 100vw, 649px\" \/><\/figure><\/div>\n\n\n\n<p class=\"wp-block-paragraph\">After running the fit generator, our model starts training \u2013 <\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter\"><img loading=\"lazy\" decoding=\"async\" width=\"679\" height=\"382\" src=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-125.png\" alt=\"\" class=\"wp-image-703\" srcset=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-125.png 679w, https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-125-300x169.png 300w\" sizes=\"auto, (max-width: 679px) 100vw, 679px\" \/><\/figure><\/div>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Test Accuracy:<\/strong><\/p>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"410\" height=\"62\" src=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-126.png\" alt=\"\" class=\"wp-image-704\" srcset=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-126.png 410w, https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-126-300x45.png 300w\" sizes=\"auto, (max-width: 410px) 100vw, 410px\" \/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">The test accuracy we obtained was 90%. We now plot the training and validation curves across epochs.<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter\"><img loading=\"lazy\" decoding=\"async\" width=\"491\" height=\"332\" 
src=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-127.png\" alt=\"\" class=\"wp-image-705\" srcset=\"https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-127.png 491w, https:\/\/guires.uk\/newsroom\/wp-content\/uploads\/2019\/11\/image-127-300x203.png 300w\" sizes=\"auto, (max-width: 491px) 100vw, 491px\" \/><\/figure><\/div>\n\n\n\n<p class=\"wp-block-paragraph\">Other models tried:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>Adding more CNN layers and dropout gave somewhat higher accuracy and did not overfit the data.<\/li><li>We used different optimizers, such as Adam and Adagrad, to check whether we could achieve better accuracy.<\/li><li>We also tried tweaking the parameters of different layers, tried dropout again, and used different numbers of epochs and batch normalization.<\/li><\/ul>\n\n\n\n<h3 class=\"h3color\">Why Guires<\/h3>\n\n\n\n<ul class=\"wp-block-list\"><li>The mission of Guires Data Analytics is to democratize AI for healthcare industries. The team of data science experts uses the power of AI to solve business and social challenges. We have been a pioneer in the research field for more than fifteen years and offer end-to-end solutions that help firms set their direction and support analytical frameworks for better understanding and strategic decision-making. We provide appropriate solutions using your existing volume of data, available in varying degrees of complexity, that cannot be processed using traditional technologies, processing methods, or any commercial off-the-shelf solutions. When you outsource big data to us, we can analyze events that have happened within and outside an organization and correlate them to provide near-accurate insights into what drove the outcome. Our big data analytics solutions are fast, scalable and flexible in processing.<\/li><li>We use powerful algorithms, business rules, and statistical models. We work with text, image, audio, video and machine data. 
Our medical experts understand the different layers of data being integrated and what granularity levels of integration can be completed to create the holistic picture. Our team creates the foundational structure for analytics and visualization of the data. Our data analytics team includes members with advanced mathematical degrees and statisticians with multiple specialist degrees, who can apply cutting-edge data mining techniques, thereby enabling our clients to gain rich insights into existing customers and unearth high-potential prospects.<\/li><\/ul>\n","protected":false},"excerpt":{"rendered":"<p>The Challenges Pneumonia&nbsp;is an&nbsp;inflammatory&nbsp;condition of the&nbsp;lung&nbsp;affecting primarily the small air sacs known as&nbsp;alveoli.&nbsp;Typically symptoms include some combination of&nbsp;productive&nbsp;or dry&nbsp;cough,&nbsp;chest pain,&nbsp;fever, and&nbsp;trouble breathing. Severity is variable. Pneumonia is usually caused by infection with&nbsp;viruses&nbsp;or&nbsp;bacteria&nbsp;and less commonly by other&nbsp;microorganisms, certain&nbsp;medications&nbsp;and conditions such as&nbsp;autoimmune diseases.&nbsp;Risk factors include&nbsp;cystic fibrosis,&nbsp;chronic obstructive pulmonary disease&nbsp;(COPD),&nbsp;asthma,&nbsp;diabetes,&nbsp;heart failure, a history of&nbsp;smoking, a poor ability to 
[&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":707,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[55],"tags":[],"class_list":["post-694","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-use-cases"],"_links":{"self":[{"href":"https:\/\/guires.uk\/newsroom\/wp-json\/wp\/v2\/posts\/694","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/guires.uk\/newsroom\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/guires.uk\/newsroom\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/guires.uk\/newsroom\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/guires.uk\/newsroom\/wp-json\/wp\/v2\/comments?post=694"}],"version-history":[{"count":2,"href":"https:\/\/guires.uk\/newsroom\/wp-json\/wp\/v2\/posts\/694\/revisions"}],"predecessor-version":[{"id":708,"href":"https:\/\/guires.uk\/newsroom\/wp-json\/wp\/v2\/posts\/694\/revisions\/708"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/guires.uk\/newsroom\/wp-json\/wp\/v2\/media\/707"}],"wp:attachment":[{"href":"https:\/\/guires.uk\/newsroom\/wp-json\/wp\/v2\/media?parent=694"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/guires.uk\/newsroom\/wp-json\/wp\/v2\/categories?post=694"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/guires.uk\/newsroom\/wp-json\/wp\/v2\/tags?post=694"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}