I'm trying to build a CNN capable of detecting COVID-19 from chest x-rays. I'm using this Kaggle dataset. It has roughly 27k images; I'm only using the COVID and NORMAL ones.
I started by following the Keras image classification tutorial, and after some tweaks I have something like this:
import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras.models import Sequential

batch_size = 16
img_height = 160
img_width = 160
img_size = (img_height, img_width)
seed_train_validation = 1
shuffle_value = True
validation_split = 0.3

train_ds = tf.keras.utils.image_dataset_from_directory(
    data_dir,
    image_size = img_size,
    batch_size = batch_size,
    validation_split = validation_split,
    subset = "training",
    seed = seed_train_validation,
    color_mode = "grayscale",
    shuffle = shuffle_value
)
val_ds = tf.keras.utils.image_dataset_from_directory(
    data_dir,
    image_size = img_size,
    batch_size = batch_size,
    validation_split = validation_split,
    subset = "validation",
    seed = seed_train_validation,
    color_mode = "grayscale",
    shuffle = shuffle_value
)

# Grab the class names before wrapping the datasets with cache()/prefetch(),
# which return plain tf.data.Dataset objects without a class_names attribute.
class_names = train_ds.class_names

# Carve a test set out of the validation split (2/3 test, 1/3 validation)
val_batches = tf.data.experimental.cardinality(val_ds)
test_ds = val_ds.take((2*val_batches) // 3)
val_ds = val_ds.skip((2*val_batches) // 3)
AUTOTUNE = tf.data.AUTOTUNE
train_ds = train_ds.cache().shuffle(1000).prefetch(buffer_size=AUTOTUNE)
val_ds = val_ds.cache().prefetch(buffer_size=AUTOTUNE)
resize_and_rescale = tf.keras.Sequential([
    layers.Resizing(img_height, img_width),
    layers.Rescaling(1./255)
])

data_augmentation = tf.keras.Sequential([
    layers.RandomFlip("horizontal_and_vertical"),
    layers.RandomRotation(0.2),
    layers.RandomZoom(0.1)
])
num_classes = len(class_names)

model_1 = Sequential([
    resize_and_rescale,
    data_augmentation,
    layers.Conv2D(16, 3, padding='same', activation='relu'),
    layers.MaxPooling2D(),
    layers.Conv2D(32, 3, padding='same', activation='relu'),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, padding='same', activation='relu'),
    layers.MaxPooling2D(),
    layers.Dropout(0.2),
    layers.Flatten(),
    layers.Dense(128, activation='relu'),
    layers.Dense(num_classes)
])
model_1.compile(optimizer="adam",
                loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
                metrics=['accuracy'])

epochs = 75
history = model_1.fit(
    train_ds,
    validation_data = val_ds,
    epochs = epochs
)
If I train for fewer epochs, say 10, the accuracy and loss plots show a nice exponential-looking curve. However, if I increase the number of epochs, I get some weird graphs like the ones below:

[Plots: results after training for 75 epochs]
I have already introduced data augmentation and a dropout layer, but I don't get better results no matter what. Any tips?
It seems that my model is overfitting, but I don't have enough experience to say that for sure. However, I've read that adding data augmentation and a dropout layer works for most people, and that doesn't seem to be the case for me.
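One thing I'm considering is stopping training automatically once validation loss stops improving, instead of hard-coding 75 epochs, using Keras's built-in EarlyStopping callback. A sketch (the patience value is just an illustrative guess, not something I've tuned):

```python
import tensorflow as tf

# Stop once val_loss hasn't improved for 5 consecutive epochs, and roll the
# model back to the best weights seen so far.
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss",
    patience=5,
    restore_best_weights=True,
)

# Would be passed to fit() like:
# history = model_1.fit(train_ds, validation_data=val_ds,
#                       epochs=epochs, callbacks=[early_stop])
```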
After some more iterations, I think I've figured out the main problem.
My dataset directory structure was something like this:
MainDir:
So, when I ran image_dataset_from_directory, it was gathering all the images, masks included, and feeding them to my model, so it was pretty hard for it to find patterns with x-rays and masks mixed together. After removing the masks entirely, my model seems to be working just fine.
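For anyone hitting the same issue, a quick way to check what image_dataset_from_directory will pick up is to list the files yourself while skipping the mask folders. A small sketch, assuming the masks live in subfolders literally named "masks" and the images are PNGs (both assumptions about the dataset layout):

```python
from pathlib import Path

def list_xray_files(data_dir, excluded_dir="masks"):
    """Collect image paths under data_dir, skipping any folder named excluded_dir.

    The "masks" folder name and the *.png extension are assumptions about how
    this particular dataset is laid out.
    """
    return sorted(
        p for p in Path(data_dir).rglob("*.png")
        if excluded_dir not in p.parts
    )
```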